MODULATION OF GENE EXPRESSION USING
LOCALIZATION DOMAINS

TECHNICAL FIELD 
The present disclosure is in the field of gene regulation, specifically, using
compositions containing localization domain polypeptides, or functional fragments
thereof, to modulate gene expression.



BACKGROUND 

The development of an organism and the ultimate function of any given cell in
that organism depend on the particular set of genes being expressed (e.g., transcribed
and translated) in the cell. Since virtually all the genes in the human genome have
now been sequenced, the challenge is to understand the molecular mechanisms
that allow these genes to be selectively expressed.

In vertebrates, DNA methylation of CpG dinucleotides has long been
identified as an important mechanism of development. DNA methylation is required
for normal development (Ohki et al. (1999) EMBO J 18:6653-6661; Okano et al.
(1999) Cell 99:247-257); is correlated with genomic imprinting (Ashburner (1972)
Results Probl Cell Differ 4:101-151; Grunstein et al. (1997) Nature 389:349-352); and
is involved in X-chromosome inactivation (Heard et al. (1997) Annual Rev Genet
31:571-610). A large body of evidence indicates that cytosine methylation leads to
the assembly of a specialized, heritable, repressive chromatin architecture through the
recruitment of histone deacetylases (Bird and Wolffe (1999) Cell 99:451-454;
Siegfried et al. (1997) Curr Biol 7:R305-307). However, the precise role of DNA
methylation in tissue-specific regulation of non-imprinted genes remains contentious
(Bird (1997) Trends Genet 13:469-472).

Thus, DNA methylation appears to be critical in vertebrate development,
which relies upon the imposition of progressively more stable states of transcriptional
repression (Steinbach et al. (1997) Nature 389:395-399; Mannervik et al. (1999)
Science 284:606-609). Further, DNA methylation may play a role in partitioning the
genome, and the chromosomal infrastructure within which it is packaged, into active
and inactive intranuclear compartments (Bird et al. (1995) Trends Genet. 11:94-99).
For example, mouse primordial germ cells, embryonic stem cells and the cells of the
blastocyst can progress through the cell cycle and divide without detectable DNA
methylation (Lei et al. (1996) Development 122:3195-3205). Once differentiation
begins, however, DNA methylation becomes essential for individual cell viability (Li
et al. (1992) Cell 69:915-926; Okano et al. (1999) Cell 99:247-257).

DNA methylation has also been implicated in clinical disease states. Parasitic
DNA, e.g., retrotransposons, retrovirus genomes, lentivirus genomes, L1 elements and
Alu elements, is known to be CpG rich. It has been proposed that DNA methylation
may have arisen as a genome-defense system to silence expression of these parasitic
elements and limit their spread through the genome (Yoder et al. (1997) Trends Genet.
13:335-340; Colot et al. (1999) BioEssays 21:402-411). Additionally, several genetic
diseases have been described that cause methylation defects, including the ICF
syndrome (Xu et al. (1999) Nature 402:187-189), Rett syndrome (Amir et al. (1999)
Nature Genet. 23:185-188) and fragile X syndrome (Oberle et al. (1991) Science
252:1711-1714).
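
The text above describes certain sequences as "CpG rich" without giving a numerical criterion. As a minimal illustrative sketch, and not part of the disclosure, the following Python function computes two quantities commonly used to describe CpG content: the G+C fraction and the observed/expected CpG ratio (number of CpG dinucleotides times sequence length, divided by the product of the C and G counts). The function name `cpg_stats` and the returned dictionary keys are hypothetical choices made for this example; no threshold for "rich" is implied.

```python
def cpg_stats(seq: str) -> dict:
    """Summarize CpG content of a DNA sequence (illustrative sketch only).

    Returns the sequence length, the number of CpG dinucleotides, the G+C
    fraction, and the observed/expected CpG ratio:
        obs/exp = (CpG count * length) / (C count * G count)
    """
    s = seq.upper()
    n = len(s)
    c_count = s.count("C")
    g_count = s.count("G")
    # "CG" cannot overlap itself, so str.count gives the exact number of
    # CpG dinucleotides on the strand as written (5' to 3').
    cpg_count = s.count("CG")
    gc_fraction = (c_count + g_count) / n if n else 0.0
    if c_count and g_count:
        obs_exp = (cpg_count * n) / (c_count * g_count)
    else:
        obs_exp = 0.0  # ratio is undefined without both C and G; report 0
    return {
        "length": n,
        "cpg_count": cpg_count,
        "gc_fraction": gc_fraction,
        "obs_exp_cpg": obs_exp,
    }


if __name__ == "__main__":
    print(cpg_stats("ACGTTCGCGA"))
```

The ratio only compares the number of CpG dinucleotides found with the number expected from base composition alone; it says nothing about whether the cytosines are methylated.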

Cellular DNA methylation patterns seem to be established by a complex
interplay of at least three independent DNA methyltransferases: DNMT1, DNMT3A
and DNMT3B (Kaludov and Wolffe (2000) Nuc Acids Res 28:1921-1928, and
references cited therein). Methyltransferases are required for de novo methylation
that occurs in the genome following embryo implantation and for the de novo
methylation of newly integrated retroviral sequences in mouse ES cells (Okano et al.
(1999) Cell 99:247-257). Proteins having significant homology to vertebrate
methyltransferases have been identified in zebrafish, Arabidopsis thaliana and maize
(Okano et al. (1998) Nature Genet 19:219-220; Cao et al. (2000) PNAS USA
97:4979-4984).

In addition to the methyltransferases, a group of proteins that bind to
methylated CpG sequences has also been identified. The methyl-CpG-binding
protein MECP2 is the best characterized. MECP2 has been shown to selectively
recognize methylated DNA and to repress transcription in methylated regions of the
genome (Lewis et al. (1992) Cell 69:905-914). MECP2 contains at least two
domains: the methyl-CpG-binding domain (MBD), which recognizes symmetrically
methylated CpG dinucleotides through contacts in the major groove of the double
helix (Wakefield et al. (1999) J Mol Biol 291:1055-1065), and a transcriptional
repression domain (TRD), which interacts with several other regulatory proteins (Nan
et al. (1997) Cell 88:471-481). MECP2 selectively represses transcription of
methylated templates in the absence of an organized chromatin structure and, when
tethered to a specific heterologous Gal4-binding domain, its TRD confers
transcriptional repression by interacting with TFIIB, a component of the basal
transcription machinery (Kaludov and Wolffe (2000) Nucleic Acids Res.
28:1921-1928). Methyl binding domain proteins associate with corepressor complexes
that include histone deacetylases. Methyl CpG binding proteins have also been shown to
be components of chromatin-remodeling complexes, for example the MECP2
repressor complex. Recruitment of a histone deacetylase occurs indirectly through its
interaction with the Sin3A adaptor proteins, which causes transcriptional silencing, in
part by deacetylation of histones, directing the formation of stable repressive
chromatin structures.

Thus, methylation of DNA can repress transcription through multiple
mechanisms (see, e.g., Kaludov and Wolffe (2000) Nuc Acids Res 28:1921-1928, and
references cited therein). Pathways of repression include direct inhibition of
transcription through the failure of transcription factors to associate with methylated
recognition elements (Iguchi-Ariga et al. (1989) Genes Dev. 3:612-619) and indirect
pathways involving either occlusion of methylated sequences by transcriptional
repressors that recognize methylated DNA (Meehan et al. (1992) Nucleic Acids Res.
20:5085-5092) or the modification of chromatin structure targeted by
methyl-CpG-specific transcriptional repressors (Buschhausen et al. (1987) PNAS USA
84:1177-1181; Kass et al. (1997) Curr. Biol. 7:157-165).

Despite the characterization of the functional properties of
methyl-CpG-specific binding proteins and their constituent MBDs, it has not heretofore been
possible to target the various functional activities of MBDs for use in specific and
directed modulation of gene expression.

SUMMARY 

In one aspect, methods of compartmentalizing a region of interest in cellular
chromatin are provided. The methods comprise contacting the region of interest with
a composition that binds to a binding site in cellular chromatin, wherein the binding
site is in a gene of interest and wherein the composition comprises a localization
domain or functional fragment thereof, and a DNA binding domain or functional
fragment thereof. In certain embodiments, the composition is a fusion molecule, for
example a fusion polypeptide. In other embodiments, the region of interest is
compartmentalized into a nuclear compartment for packaging as heterochromatin.
The methods are useful in a variety of cells, including but not limited to, plant cells
and animal cells (e.g., human). The localization domain can be a methyl CpG binding
domain obtained, for example, from MECP2, MBD1, MBD2, MBD3, dMBD-like and
dMBD-likeA, or one or more functional fragments thereof. The DNA-binding
domain can be, for example, a zinc finger protein or a triplex-forming nucleic acid or
a minor groove binder. In certain embodiments, any of the methods described herein
facilitate modulation of expression of a gene associated with the region of interest, for
example repression of the gene. In other embodiments, the methods described herein
further comprise the step of contacting a cell with a polynucleotide encoding a fusion
polypeptide, wherein the fusion polypeptide is expressed in the cell. The gene can
encode any product, for example, vascular endothelial growth factor, erythropoietin,
androgen receptor, PPAR-γ2, p16, p53, pRb, dystrophin and e-cadherin. Furthermore,
in other embodiments, the region of interest is involved in disease states selected from
the group consisting of ICF syndrome, Rett syndrome and Fragile X syndrome.

In another aspect, methods are provided for modulation of gene expression,
wherein the methods comprise the step of contacting a region of DNA in cellular
chromatin with a fusion molecule that binds to a binding site in cellular chromatin,
wherein the binding site is in the gene and wherein the fusion molecule comprises a
DNA binding domain and a localization domain, for example, a methyl CpG binding
domain. Modulation of the gene can be, for example, repression of the target gene.
The DNA-binding domain of the fusion molecule can be, for example, a zinc finger
DNA-binding domain. Further, the DNA binding domain can bind to a variety of
target sites, for example to a target site in a gene encoding a product selected from the
group consisting of vascular endothelial growth factor, erythropoietin, androgen
receptor, PPAR-γ2, p16, p53, pRb, dystrophin and e-cadherin. The localization
domain can be a methyl CpG binding domain obtained from, for example, MECP2,
MBD1, MBD2, MBD3, dMBD-like and dMBD-likeA, or one or more functional
fragments thereof. In still further embodiments, the methods involve contacting
cellular chromatin with a plurality of fusion molecules.

In other aspects, methods of modulating gene expression are provided,
wherein the methods comprise the step of contacting a region of DNA in cellular
chromatin with a fusion molecule that binds to a binding site in cellular chromatin,
wherein the binding site is in the gene and wherein the fusion molecule comprises a
DNA binding domain, a localization domain such as, for example, a methyl CpG
binding domain, and a regulatory domain (such fusion molecules can include
functional fragments of any of these domains). Modulation of gene expression can
be, for example, repression (e.g., using a repression domain or functional fragment
thereof as the transcriptional regulatory domain) or activation (e.g., using an activation
domain, such as for example VP16, or a functional fragment thereof, as the
transcriptional regulatory domain). The regulatory domain can also comprise a
component of a chromatin remodeling complex (or a functional fragment thereof)
with the capacity to recruit complexes capable of remodeling chromatin of the target
gene into either a transcriptionally active or a transcriptionally inactive state, as
desired. The DNA-binding domain of the fusion molecule can comprise a zinc finger
DNA-binding domain. Further, the DNA binding domain can bind to any target site,
for example a target site in a gene encoding a product selected from the group
consisting of vascular endothelial growth factor, erythropoietin, androgen receptor,
PPAR-γ2, p16, p53, pRb, dystrophin and e-cadherin. The localization domain can be
a methyl CpG binding domain obtained, for example, from MECP2, MBD1, MBD2,
MBD3, dMBD-like and dMBD-likeA, or one or more functional fragments thereof.
In still further embodiments, a plurality of fusion molecules is contacted with cellular
chromatin, wherein each of the fusion molecules binds to a distinct binding site, for
example, to modulate expression of one or more genes.

In yet another aspect, a fusion polypeptide comprising a localization domain
or functional fragment thereof and a DNA binding domain or a functional fragment
thereof is provided. In certain embodiments, the fusion polypeptide also comprises a
regulatory domain, for example an activation domain (e.g., VP16, p65), a repression
domain (e.g., KRAB, v-erbA) or a component of a chromatin remodeling complex.
Any of the polypeptides described herein can include a DNA-binding domain which
is a zinc finger DNA binding domain and a localization domain which can be, for
example, a methyl CpG binding domain such as obtained, for example, from MECP2,
MBD1, MBD2, MBD3, dMBD-like and dMBD-likeA or functional fragments thereof.
These fusion polypeptides can bind, for example, to a target site in a gene encoding a
product selected from the group consisting of vascular endothelial growth factor,
erythropoietin, androgen receptor, PPAR-γ2, p16, p53, pRb, dystrophin and
e-cadherin. Polynucleotides encoding any of the fusion polypeptides described herein
are also provided, as are cells comprising the polypeptides and/or polynucleotides
encoding the polypeptides.

These and other embodiments will be readily apparent to one of skill in the art 
upon reading the present disclosure. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figures 1A and 1B are sequence alignments depicting that Drosophila
contains multiple proteins with significant similarity to vertebrate methyl CpG
binding proteins.

Figure 1A depicts the similarity of Drosophila proteins to the methyl CpG
binding domain motif. The amino acid sequences corresponding to the methyl CpG
binding motif of human MeCP2, human MBD1, human MBD4, human MBD2, and
Xenopus MBD3 (xMBD3) are aligned with the corresponding sequences from the
indicated Drosophila gene products. A 23 amino acid segment from the Drosophila
MBD-like sequence (NNNASSNNNSSATASSNNNNNKV, SEQ ID NO: 1) has been
omitted in the loop L1 to facilitate the alignment. Positions of beta strands, loops, and
the alpha helix defined by the solution structures of MeCP2 and MBD1 are indicated
above the alignment. Residues boxed in the alignment are identical or similar in all or
all but one of the sequences depicted. Residues indicated by the symbol ψ define
hydrophobic residues crucial for the basic fold of the motif. Residues indicated by the
squares constitute the basic patch on one surface of the wedge structure. The two
residues indicated by the diamond symbols are conserved hydrophobic residues
critical for the structure of the hairpin loop.

Figure 1B depicts the similarity of dMBD-like to Xenopus MBD3. The
deduced amino acid sequences of Drosophila MBD-like and MBD-likeA are aligned
with Xenopus MBD3 and MBD3-LF. Amino acids identical in all the proteins are
shaded in dark gray; amino acids with similar side chain chemistry are shaded light
gray and indicated by upward arrows. A box indicates the methyl CpG binding
domains. The secondary structure of the methyl-CpG binding domain is indicated at
the top. Arrows represent β-sheet segments; rectangles represent α-helices. Loops
appear as thick lines.

Figure 1C is an immunoblot depicting that immobilized dMBD-like fails to
bind methylated DNA. The bottom two panels depict Southwestern assays performed
with recombinant X. laevis MBD3, Drosophila MBD-like and MBD-likeA. The
middle panel (labeled GAC12) is probed with the unmethylated DNA probe. The
lower panel (labeled GAM12) is probed with the methylated probe. The top panel
(labeled Coomassie) is a Coomassie Blue stained gel of lanes identical to those in the
middle and lower panels. Each panel contains triplicate samples of the indicated
protein.

Figure 1D is an immunoblot depicting that dMBD-like fails to bind
methylated DNA in solution. Xenopus MBD3, Drosophila MBD-like and
MBD-likeA were examined for the ability to bind to methylated (GAM12) or unmethylated
(GAC12) DNA probes. Binding reactions were performed as described in Example 1.
Lanes 1-5 of each gel contain radiolabelled, unmethylated GAC12 as a probe and
lanes 6-10 contain radiolabelled, fully methylated GAM12. For each gel, lanes 1 and
6 contain only the probe without any added protein. Lanes 2 and 7 contain 50 ng of
protein, lanes 3 and 8 contain 75 ng of protein and lanes 4, 5, 9 and 10 contain 150 ng
of protein. Binding was competed with either GAC12 or GAM12 as competitor (U,
unmethylated GAC12; M, methylated GAM12) as indicated at the bottom of the
figure.

Figure 2A is an immunoblot showing that dMBD-likeA is the predominant
form of dMBD-like protein found in Drosophila S2 cells. The immunoblot was
prepared and analyzed with α-dMBD-like serum as described in Example 1. The
lanes were loaded as follows: Lane 1, 5 µl S2 nuclear extract; Lane 2, 10 µl nuclear
extract; Lane 3, recombinant dMBD-like; Lane 4, recombinant dMBD-likeA.

Figure 2B shows association of dMBD-likeA with histone deacetylase activity
in S2 nuclear extracts. Immunoprecipitations were performed as described in
Example 1 on S2 nuclear extracts using the α-dMBD-like antiserum or pre-immune
serum from the same rabbit. Precipitates were analyzed for HDAC activity using the
deacetylase assay described in Example 1. Acetate released is indicated in the bar
graph as cpm tritium. Samples are as follows: 1, no antiserum control; 2, pre-immune
serum; 3, α-dMBD-like serum. Precipitations were performed multiple times; a
representative example is depicted.

Figure 2C shows association of dMBD-like with nucleosome-stimulated
ATPase activity. Immunoprecipitations were performed on S2 nuclear extracts as
described in Figure 2B. Precipitated proteins were analyzed for ATPase activity as
described in Example 1. The bar graph depicts inorganic phosphate produced in
arbitrary units. Samples are as follows: 1, no antiserum control; 2, pre-immune
serum; 3, α-dMBD-like serum. Light and dark bars correspond respectively to the
absence and the presence of chicken erythrocyte mononucleosomes during the
ATPase assay.

Figure 3A is a schematic depicting partial resolution of dMBD-likeA from
SIN3 and RPD3 by ion exchange chromatography. S2 nuclear extract was
fractionated according to the scheme depicted in Fig. 3A and described in detail in
Example 1. HDAC activity assays and immunoblot analysis of the indicated fractions
from the MonoQ column are shown below the flow chart.

Figure 3B is an immunoblot depicting coelution of dMBD-likeA with components
of the Mi-2 complex on a gel filtration column. Fraction 24 from the MonoQ column was
resolved on a Superose 6 gel filtration column as described in Example 1. Indicated
fractions were analyzed by immunoblot using the antisera indicated.

Figure 4A shows schematic depictions of the plasmids used for the
transfection assays. A description of plasmid construction is presented in Example 1.

Figure 4B depicts transcriptional repression as a function of dose of
Gal4-tethered dMBD-like, dMBD-likeA, and Groucho. Experiments were performed in
triplicate and error bars are shown.

Figure 4C is an immunoblot showing that expression of the transiently
transfected Gal4 derivatives is equivalent. Extracts from cells transfected with each
of the indicated constructs were analyzed by immunoblot using either α-dMBD-like
or α-Gal4 antisera.

Figure 4D shows that TSA relieves repression by Gal4-Gro, Gal4-dMBD-like
and Gal4-dMBD-likeA. The graph depicts luciferase activity from the GsDEstkLuc
reporter driven by the indicated Gal4 derivatives as a percentage of luciferase
expression from the same reporter in the absence of any transfected Gal4 protein. All
the samples are the average of triplicates. TSA was used at 100 nM and 400 nM as
indicated in the figure.

Figures 5A and 5B show regulation of dMBD-like and dMBD-likeA mRNA and
protein expression during development.

Figure 5A is a Northern (RNA) blot showing dMBD-like and dMBD-likeA
expression through development. Total RNA isolated from various developmental
stages (~10 µg/lane) was fractionated on a formaldehyde-agarose gel and transferred
to a nylon membrane as described in Example 1. Lanes 1-3, embryonic stages: 0-3 h,
3-12 h, 12-24 h; lanes 4-6, larval stages: 1st, 2nd and 3rd instars; lane 7, male adult
flies; lane 8, female adult flies.

Figure 5B is an immunoblot showing dMBD-like and dMBD-likeA levels
during development. Lanes 1-8 correspond to the same samples as in panel A.
Equivalent amounts of protein were loaded in each lane.

DETAILED DESCRIPTION

Disclosed herein are compositions containing localization domains and
methods for their preparation and use. The methods and compositions allow, for
example, localization of corepression complexes either (1) to facilitate their
recruitment to particular sites within chromatin by fusion of a localization domain to a
DNA binding domain that can access such a site to repress gene activity or (2) to
interfere with corepressive function, for example by attaching an activation domain to
a DNA binding domain-localization domain fusion to affect repressive influences and
promote gene activation.

In a preferred embodiment, a localization domain is a methyl binding domain
(MBD) or a functional fragment thereof. Vertebrate methyl binding domain proteins
are known to recognize and bind to CpG dinucleotide sequences in which the C
residue is methylated. However, a surprising and unexpected ability of MBDs
(including invertebrate MBDs which do not bind to methylated DNA) is their capacity
to localize DNA, for example in corepression complexes. Thus, the methods and
compositions disclosed herein allow for modulation of gene expression by employing
a composition comprising a localization domain polypeptide or functional fragment
thereof. The localization domain polypeptides can be selected for their ability to
affect transcription, for example via their capacity to interact with corepression
complexes and/or facilitate compartmentalization of target sequences in repressive
compartments of the nucleus.

In one aspect, compositions and methods useful in modulating expression of a
target gene are provided. The compositions typically comprise a fusion molecule
comprising a localization domain and a DNA-binding domain. In one preferred
embodiment, the localization domain comprises an MBD (or functional fragment
thereof) and the DNA binding domain comprises a zinc finger protein (ZFP) or
functional fragment thereof. In still further aspects, the compositions further
comprise a transcriptional regulatory domain (a "functional domain"), for example an
activation or repression domain.

Thus, it will be apparent to one of skill in the art that the use of localization
domain(s) or functional fragments thereof will facilitate the regulation of many
processes involving gene expression including, but not limited to, replication,
recombination, repair, transcription, telomere function and maintenance, sister
chromatid cohesion, mitotic chromosome segregation and, in addition, binding of
transcription factors.

General 

The practice of the disclosed methods, and the uses of the disclosed
compositions, employ, unless otherwise indicated, conventional techniques in
molecular biology, biochemistry, chromatin structure and analysis, computational
chemistry, cell culture, recombinant DNA and related fields as are within the skill of
the art. These techniques are fully explained in the literature. See, for example,
Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, Second edition,
Cold Spring Harbor Laboratory Press, 1989; Sambrook et al., MOLECULAR CLONING:
A LABORATORY MANUAL, Third edition, Cold Spring Harbor Laboratory Press, 2001;
Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons,
New York, 1987 and periodic updates; the series METHODS IN ENZYMOLOGY,
Academic Press, San Diego; Wolffe, CHROMATIN STRUCTURE AND FUNCTION, Third
edition, Academic Press, San Diego, 1998; METHODS IN ENZYMOLOGY, Vol. 304,
"Chromatin" (P.M. Wassarman and A. P. Wolffe, eds.), Academic Press, San Diego,
1999; and METHODS IN MOLECULAR BIOLOGY, Vol. 119, "Chromatin Protocols"
(P.B. Becker, ed.) Humana Press, Totowa, 1999.

The terms "nucleic acid," "polynucleotide," and "oligonucleotide" are used
interchangeably and refer to a deoxyribonucleotide or ribonucleotide polymer in either
single- or double-stranded form. For the purposes of the present disclosure, these terms
are not to be construed as limiting with respect to the length of a polymer. The terms can
encompass known analogues of natural nucleotides, as well as nucleotides that are
modified in the base, sugar and/or phosphate moieties. In general, an analogue of a
particular nucleotide has the same base-pairing specificity; i.e., an analogue of A will
base-pair with T. The terms also encompass nucleic acids containing modified backbone
residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring,
which have similar binding properties as the reference nucleic acid, and which are
metabolized in a manner similar to the reference nucleotides. Examples of such analogs
include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates,
chiral-methyl phosphonates, 2-O-methyl ribonucleotides, and peptide-nucleic acids (PNAs).
Unless otherwise indicated, a particular nucleic acid sequence also implicitly
encompasses conservatively modified variants thereof (e.g., degenerate codon
substitutions) and complementary sequences, as well as the sequence explicitly
indicated. Nucleic acids include, for example, genes, cDNAs, and mRNAs.
Polynucleotide sequences are displayed herein in the conventional 5'-3' orientation.
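
The statements above, that a recited sequence implicitly encompasses its complementary sequence and that sequences are written 5'-3', can be illustrated with a short sketch. The Python below is an assumption-labeled example, not part of the disclosure: the function name `reverse_complement` is hypothetical, and only the four standard DNA bases (upper or lower case) are handled. Given a strand written 5'-3', it returns the complementary strand, also written 5'-3'.

```python
# Watson-Crick pairing for the four standard DNA bases (A-T, G-C).
# Analogues and ambiguity codes are deliberately out of scope here.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of `seq`, written 5'-3'.

    The input is assumed to be written 5'-3'. Each base is replaced by its
    pairing partner and the result is reversed, so that the output also
    reads 5'-3'.
    """
    unknown = set(seq) - set("ACGTacgt")
    if unknown:
        raise ValueError(f"unsupported characters: {sorted(unknown)}")
    return seq.translate(_DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))    # GCAT
    print(reverse_complement("AAGCTT"))  # AAGCTT (palindromic site)
```

Reversal is needed because the two strands of a duplex are antiparallel: the base paired with the 5' end of one strand sits at the 3' end of the other.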

Chromatin is the nucleoprotein structure comprising the cellular genome.
"Cellular chromatin" comprises nucleic acid, primarily DNA, and protein, including
histones and non-histone chromosomal proteins. The majority of eukaryotic cellular
chromatin exists in the form of nucleosomes, wherein a nucleosome core comprises
approximately 150 base pairs of DNA associated with an octamer comprising two
each of histones H2A, H2B, H3 and H4; and linker DNA (of variable length
depending on the organism) extends between nucleosome cores. A molecule of
histone H1 (or its equivalent) is generally associated with the linker DNA. For the
purposes of the present disclosure, the term "chromatin" is meant to encompass all
types of cellular nucleoprotein, both prokaryotic and eukaryotic. Cellular chromatin
includes both chromosomal and episomal chromatin, and includes both
transcriptionally active chromatin (euchromatin) and transcriptionally inactive
chromatin (heterochromatin).

A "chromosome" is a chromatin complex comprising all or a portion of the 
genome of a cell. The genome of a cell is often characterized by its karyotype, which 
is the collection of all the chromosomes that comprise the genome of the cell. The
genome of a cell can comprise one or more chromosomes.

An "episome" is a replicating nucleic acid, nucleoprotein complex or other
structure comprising a nucleic acid that is not part of the chromosomal karyotype of a
cell. Examples of episomes include plasmids and certain viral genomes.

An "exogenous molecule" is a molecule that is not normally present in a cell, 
but can be introduced into a cell by one or more genetic, biochemical or other 
methods. Normal presence in the cell is determined with respect to the particular 
developmental stage and environmental conditions of the cell. Thus, for example, a
molecule that is present only during embryonic development of muscle is an
exogenous molecule with respect to an adult muscle cell. Similarly, a molecule
induced by heat shock is an exogenous molecule with respect to a non-heat-shocked
cell. An exogenous molecule can comprise, for example, a functioning version of a
malfunctioning endogenous molecule or a malfunctioning version of a
normally-functioning endogenous molecule.

An exogenous molecule can be, among other things, a small molecule, such as
is generated by a combinatorial chemistry process, or a macromolecule such as a
protein, nucleic acid, carbohydrate, lipid, glycoprotein, lipoprotein, polysaccharide,
any modified derivative of the above molecules, or any complex comprising one or
more of the above molecules. Nucleic acids include DNA and RNA; can be single- or
double-stranded; can be linear, branched or circular; and can be of any length.
Nucleic acids include those capable of forming duplexes, as well as triplex-forming
nucleic acids. See, for example, U.S. Patent Nos. 5,176,996 and 5,422,251. Proteins
include, but are not limited to, DNA-binding proteins, transcription factors, chromatin
remodeling factors, methylated DNA binding proteins, polymerases, methylases,
demethylases, acetylases, deacetylases, kinases, phosphatases, integrases,
recombinases, ligases, topoisomerases, gyrases and helicases.

An exogenous molecule can be the same type of molecule as an endogenous
molecule, e.g., protein or nucleic acid (i.e., an exogenous gene), providing it has a
sequence that is different from an endogenous molecule. For example, an exogenous
nucleic acid can comprise an infecting viral genome, a plasmid or episome introduced
into a cell, or a chromosome that is not normally present in the cell. Methods for the
introduction of exogenous molecules into cells are known to those of skill in the art
and include, but are not limited to, lipid-mediated transfer (i.e., liposomes, including
neutral and cationic lipids), electroporation, direct injection, cell fusion, particle
bombardment, calcium phosphate co-precipitation, DEAE-dextran-mediated transfer
and viral vector-mediated transfer.

By contrast, an "endogenous molecule" is one that is normally present in a 
particular cell at a particular developmental stage under particular environmental
conditions. For example, an endogenous nucleic acid can comprise an endogenous
gene, a chromosome, the genome of a mitochondrion, chloroplast or other organelle,
or a naturally-occurring episomal nucleic acid. Additional endogenous molecules can
include proteins, for example, transcription factors and components of chromatin
remodeling complexes.

A "fusion molecule" is a molecule in which two or more subunit molecules 
are linked, preferably covalently. The subunit molecules can be the same chemical 
type of molecule, or can be different chemical types of molecules. Examples of the
first type of fusion molecule include, but are not limited to, fusion polypeptides (for
example, a fusion between a ZFP DNA-binding domain and a methyl binding
domain) and fusion nucleic acids (for example, a nucleic acid encoding the fusion
polypeptide described supra). Examples of the second type of fusion molecule
include, but are not limited to, a fusion between a triplex-forming nucleic acid and a
polypeptide, and a fusion between a minor groove binder and a nucleic acid.

A "gene," for the purposes of the present disclosure, includes a DNA region
encoding a gene product (see infra), as well as all DNA regions which regulate the
production of the gene product, whether or not such regulatory sequences are adjacent
to coding and/or transcribed sequences. Accordingly, a gene includes, but is not
necessarily limited to, promoter sequences, terminators, translational regulatory
sequences such as ribosome binding sites and internal ribosome entry sites, enhancers,
silencers, insulators, boundary elements, replication origins, matrix attachment sites
and locus control regions.

"Gene expression" refers to the conversion of the information, contained in a 
gene, into a gene product. A gene product can be the direct transcriptional product of
a gene (e.g., mRNA, tRNA, rRNA, antisense RNA, ribozyme, structural RNA or any
other type of RNA) or a protein produced by translation of a mRNA. Gene products
also include RNAs which are modified, by processes such as capping,
polyadenylation, methylation, and editing, and proteins modified by, for example,
methylation, acetylation, phosphorylation, ubiquitination, ADP-ribosylation,
myristoylation, and glycosylation.

"Gene activation" and "augmentation of gene expression" refer to any process 
which results in an increase in production of a gene product. A gene product can be
either RNA (including, but not limited to, mRNA, rRNA, tRNA, and structural RNA)
or protein. Accordingly, gene activation includes those processes which increase
transcription of a gene and/or translation of a mRNA. Examples of gene activation
processes which increase transcription include, but are not limited to, those which
facilitate formation of a transcription initiation complex, those which remodel
chromatin into an active state, those which increase transcription initiation rate, those
which increase transcription elongation rate, those which increase processivity of
transcription and those which relieve transcriptional repression (by, for example,
blocking the binding of a transcriptional repressor). Gene activation can constitute,
for example, inhibition of repression as well as stimulation of expression above an
existing level. Examples of gene activation processes which increase translation
include those which increase translational initiation, those which increase translational
elongation and those which increase mRNA stability. In general, gene activation
comprises any detectable increase in the production of a gene product, preferably an
increase in production of a gene product by about 2-fold, more preferably from about
2- to about 5-fold or any integer therebetween, more preferably between about 5- and
about 10-fold or any integer therebetween, more preferably between about 10- and
about 20-fold or any integer therebetween, still more preferably between about 20-
and about 50-fold or any integer therebetween, more preferably between about 50-
and about 100-fold or any integer therebetween, more preferably 100-fold or more.

"Gene repression" and "inhibition of gene expression" refer to any process
which results in a decrease in production of a gene product. A gene product can be
either RNA (including, but not limited to, mRNA, rRNA, tRNA, and structural RNA)
or protein. Accordingly, gene repression includes those processes which decrease
transcription of a gene and/or translation of a mRNA. Examples of gene repression
processes which decrease transcription include, but are not limited to, those which
inhibit formation of a transcription initiation complex, those which remodel chromatin
into an inactive state, those which decrease transcription initiation rate, those which
decrease transcription elongation rate, those which decrease processivity of
transcription and those which antagonize transcriptional activation (by, for example,
blocking the binding of a transcriptional activator). Gene repression can constitute,
for example, prevention of activation as well as inhibition of expression below an
existing level. Examples of gene repression processes which decrease translation
include those which decrease translational initiation, those which decrease
translational elongation and those which decrease mRNA stability. Transcriptional
repression includes both reversible and irreversible inactivation of gene transcription.
In general, gene repression comprises any detectable decrease in the production of a
gene product, preferably a decrease in production of a gene product by about 2-fold,
more preferably from about 2- to about 5-fold or any integer therebetween, more
preferably between about 5- and about 10-fold or any integer therebetween, more
preferably between about 10- and about 20-fold or any integer therebetween, still
more preferably between about 20- and about 50-fold or any integer therebetween,
more preferably between about 50- and about 100-fold or any integer therebetween,
more preferably 100-fold or more. Most preferably, gene repression results in
complete inhibition of gene expression, such that no gene product is detectable.

"Eucaryotic cells" include, but are not limited to, fungal cells (such as yeast), 
plant cells, insect cells, animal cells, teleost cells, mammalian cells and human cells. 

The terms "operable linkage," "operably linked," "operative linkage" and
"operatively linked" are used with reference to a juxtaposition of two or more components
(such as sequence elements), in which the components are placed into a functional
relationship with one another. Thus operatively linked components are arranged such that
at least one of the components can mediate a function that is exerted upon at least one of
the other components. By way of illustration, a transcriptional regulatory sequence, such
as a promoter, is operatively linked to a coding sequence if the transcriptional regulatory
sequence controls the level of transcription of the coding sequence in response to the
presence or absence of one or more transcriptional regulatory factors. An operatively
linked transcriptional regulatory sequence is generally joined in cis with a coding
sequence, but need not be directly adjacent to it. For example, an enhancer can constitute
a transcriptional regulatory sequence that is operatively linked to a coding sequence, even
though they are not contiguous. Similarly, certain amino acid sequences that are
non-contiguous in a primary polypeptide sequence may nonetheless be operably linked due to,
for example, folding of a polypeptide chain.

With respect to fusion polypeptides, the terms "operably linked" and
"operatively linked" can refer to the fact that each of the components performs the
same function in linkage to the other component as it would if it were not so linked.
For example, with respect to a fusion polypeptide in which a ZFP DNA-binding
domain is fused to a transcriptional activation domain (or functional fragment
thereof), the ZFP DNA-binding domain and the transcriptional activation domain (or
functional fragment thereof) are in operative linkage if, in the fusion polypeptide, the
ZFP DNA-binding domain portion is able to bind its target site and/or its binding site,
while the transcriptional activation domain (or functional fragment thereof) is able to
activate transcription.

A "functional fragment" of a protein, polypeptide or nucleic acid is a protein,
polypeptide or nucleic acid whose sequence is not identical to its native or full-length
counterpart, yet retains the same function as the native or full-length counterpart. A
functional fragment can possess more, fewer, or the same number of residues as the
corresponding native or full-length molecule, and/or can contain one or more amino
acid or nucleotide analogues or substitutions. Methods for determining the function
of a nucleic acid (e.g., coding function, ability to hybridize to another nucleic acid)
are well-known in the art. Similarly, methods for determining protein function are
well-known. For example, the DNA-binding function of a polypeptide can be
determined by filter-binding, electrophoretic mobility-shift, or
immunoprecipitation assays. See Ausubel et al., supra. The ability of a protein to
interact with another protein can be determined, for example, by
co-immunoprecipitation, two-hybrid assays or complementation, both genetic and
biochemical. See, for example, Fields et al. (1989) Nature 340:245-246; U.S. Patent
No. 5,585,245 and PCT WO 98/44350.

The term "recombinant," when used with reference to a cell, indicates that the
cell replicates an exogenous nucleic acid, or expresses a peptide or protein encoded by
an exogenous nucleic acid. Recombinant cells can contain genes that are not found
within the native (non-recombinant) form of the cell. Recombinant cells can also
contain genes found in the native form of the cell wherein the genes are modified and
re-introduced into the cell by artificial means. The term also encompasses cells that
contain a nucleic acid endogenous to the cell that has been modified without
removing the nucleic acid from the cell; such modifications include those obtained by
gene replacement, site-specific mutation, and related techniques. Thus, for example,
recombinant cells express genes that are not found within the native (naturally
occurring) form of the cell or express a second copy of a native gene that is otherwise
normally or abnormally expressed, underexpressed or not expressed at all.
Recombinant cells also include cells or cell lines derived from cells that have been
modified as described.

The term "recombinant," when used with reference, e.g., to a nucleic acid,
protein, or vector, refers to nucleic acids, proteins or vectors that have been modified
by the introduction of heterologous nucleic acid or amino acid sequence, and includes
any other alterations of a native nucleic acid or protein.

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription
of a particular nucleic acid in a host cell, and optionally integration and/or replication
of the expression vector in a host cell. The expression vector can be part of a plasmid,
viral genome, or nucleic acid fragment, of viral or non-viral origin. Expression
vectors can be, for example, naked DNA molecules, or can comprise nucleic acid of
viral or nonviral origin packaged into viral particles. Typically, the expression vector
includes an "expression cassette," which comprises a nucleic acid to be transcribed
operably linked to control elements that are capable of effecting expression of a
nucleic acid that is operatively linked to the control elements in hosts compatible with
such sequences. Expression cassettes include at least promoters and, optionally,
transcription termination signals. Typically, a recombinant expression cassette
includes at least a nucleic acid to be transcribed (e.g., a nucleic acid encoding a
desired polypeptide) and a promoter. Additional factors necessary or helpful in
effecting expression can also be used; for example, an expression cassette can also
include nucleotide sequences that encode a signal sequence that directs secretion of an
expressed protein from the host cell. Transcription termination signals, enhancers,
and other nucleic acid sequences that influence gene expression can also be included
in an expression cassette.

The term "naturally occurring," as applied to an object, means that the object 
can be found in nature.

The terms "polypeptide," "peptide" and "protein" are used interchangeably
herein to refer to a polymer of amino acid residues. The terms apply to amino acid
polymers in which one or more amino acid residue is an analog or mimetic of a
corresponding naturally occurring amino acid, as well as to naturally occurring amino
acid polymers. Polypeptides can be modified, e.g., by phosphorylation, methylation,
myristoylation, acetylation and/or the addition of carbohydrate residues to form
glycoproteins. The terms "polypeptide," "peptide" and "protein" include all of these
modified polypeptides, as well as polypeptides comprising any additional covalent or
non-covalent modification. Polypeptide sequences are displayed herein in the
conventional N-terminal to C-terminal orientation.

A "subsequence" or "segment" when used in reference to a nucleic acid or
polypeptide refers to a sequence of nucleotides or amino acids that comprise a part of
a longer sequence of nucleotides or amino acids (e.g., a polypeptide), respectively.

The term "antibody" as used herein includes antibodies obtained from both
polyclonal and monoclonal preparations, as well as the following: (i) hybrid
(chimeric) antibody molecules (see, for example, Winter et al. (1991) Nature
349:293-299; and U.S. Patent No. 4,816,567); (ii) F(ab')2 and F(ab) fragments; (iii)
Fv molecules (noncovalent heterodimers, see, for example, Inbar et al. (1972) Proc.
Natl. Acad. Sci. USA 69:2659-2662; and Ehrlich et al. (1980) Biochem 19:4091-
4096); (iv) single-chain Fv molecules (sFv) (see, for example, Huston et al. (1988)
Proc. Natl. Acad. Sci. USA 85:5879-5883); (v) dimeric and trimeric antibody
fragment constructs; (vi) humanized antibody molecules (see, for example,
Riechmann et al. (1988) Nature 332:323-327; Verhoeyan et al. (1988) Science
239:1534-1536; and U.K. Patent Publication No. GB 2,276,169, published 21
September 1994); (vii) Mini-antibodies or minibodies (i.e., sFv polypeptide chains
that include oligomerization domains at their C-termini, separated from the sFv by a
hinge region; see, e.g., Pack et al. (1992) Biochem 31:1579-1584; Cumber et al.
(1992) J. Immunology 149B:120-126); and (viii) any functional fragments obtained
from such molecules, wherein such fragments retain specific-binding properties of the
parent antibody molecule.

"Specific binding" between an antibody or other binding agent and an antigen,
or between two binding partners, means that the dissociation constant for the
interaction is less than 10"^ M. Preferred antibody/antigen or binding partner 
complexes have a dissociation constant of less than about 10*^ M, and preferably 10"^ 
M to 10"? M or 10'1<^ M or lower. 

A "binding protein" or "binding domain" is a protein or polypeptide that is able to
bind non-covalently to another molecule. A binding protein can bind to, for example, a
DNA molecule (a DNA-binding domain), an RNA molecule (an RNA-binding domain)
and/or a protein molecule (a protein-binding domain). In the case of a protein-binding
protein, it can bind to itself (to form homodimers, homotrimers, etc.) and/or it can bind to
one or more molecules of a different protein or proteins. A binding domain can have more
than one type of binding activity. For example, zinc finger proteins have DNA-binding,
RNA-binding and protein-binding activity.

A "zinc finger binding protein" is a protein or polypeptide that binds DNA, RNA
and/or protein, preferably in a sequence-specific manner, as a result of stabilization of
protein structure through coordination of a zinc ion. The term zinc finger binding protein
is often abbreviated as zinc finger protein or ZFP. The individual DNA binding domains
are typically referred to as "fingers." A ZFP has at least one finger, typically two fingers,
three fingers, or six fingers. Each finger binds from two to four base pairs of DNA,
typically three or four base pairs of DNA. A ZFP binds to a nucleic acid sequence called a
target site or target segment. Each finger typically comprises an approximately 30 amino
acid, zinc-chelating, DNA-binding subdomain. An exemplary motif characterizing one
class of these proteins (C2H2 class) is -Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His (where X is
any amino acid). Studies have demonstrated that a single zinc finger of this class consists
of an alpha helix containing the two invariant histidine residues co-ordinated with zinc
along with the two cysteine residues of a single beta turn (see, e.g., Berg & Shi, Science
271:1081-1085 (1996)).
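
By way of illustration only, the exemplary C2H2 motif given above can be written as a regular expression over one-letter amino acid codes. The following Python sketch is not part of the original disclosure: the names C2H2_MOTIF and find_c2h2_motifs and the toy sequence are hypothetical, and a match indicates only that the spacing of the cysteine and histidine residues fits the motif, not that the segment folds into a finger or binds DNA.

```python
import re

# Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His, where X is any amino acid.
# One-letter codes are assumed; the pattern name is illustrative only.
C2H2_MOTIF = re.compile(r"C[A-Z]{2,4}C[A-Z]{12}H[A-Z]{3,5}H")


def find_c2h2_motifs(sequence):
    """Return (start_index, matched_segment) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in C2H2_MOTIF.finditer(sequence.upper())]


# Toy sequence built only to satisfy the spacing rules; it is not a real protein.
toy = "MK" + "C" + "AA" + "C" + "S" * 12 + "H" + "TTT" + "H" + "GE"
print(find_c2h2_motifs(toy))
```

Because the search is non-overlapping, closely spaced or overlapping candidate motifs would need a different scanning strategy.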

Zinc finger proteins can be engineered to bind to predetermined sequences.
Examples of zinc finger engineering include designed zinc finger proteins and selected
zinc finger proteins. A "designed" zinc finger protein is a protein not occurring in nature
whose structure and composition result principally from rational criteria. Rational criteria
for design include application of substitution rules and computerized algorithms for
processing information in a database storing information of existing ZFP designs and
binding data, for example as described in PCT WO 98/53058, WO 98/53059, WO
99/53060 and WO 00/42219. A "selected" zinc finger protein is a protein not found in
nature whose production results primarily from an empirical process such as phage
display. See US 5,789,538; US 6,007,988; US 6,013,453; WO 95/19431;
WO 96/06166; WO 98/53057 and WO 98/54311.

A "target site" or "target sequence" is a sequence that is bound by a binding protein
such as, for example, a ZFP. Target sequences can be nucleotide sequences (either DNA
or RNA) or amino acid sequences. A single target site typically has about four to about
ten base pairs, but can be as long as 18-20 base pairs, e.g., for a six-finger ZFP. Typically,
a two-fingered ZFP recognizes a four to seven base pair target site, and a three-fingered
ZFP recognizes a six to ten base pair target site. By way of example, a DNA target
sequence for a three-finger ZFP is generally either 9 or 10 nucleotides in length, depending
upon the presence and/or nature of cross-strand interactions between the ZFP and the
target sequence. Target sequences can be found in any DNA or RNA sequence, including
regulatory sequences, exons, introns, or any non-coding sequence.

A "target subsite" or "subsite" is the portion of a DNA target site that is bound by a
single zinc finger, excluding cross-strand interactions. Thus, in the absence of cross-strand
interactions, a subsite is generally three nucleotides in length. In cases in which a
cross-strand interaction occurs (e.g., a "D-able subsite," as described for example in co-owned
PCT WO 00/42219) a subsite is four nucleotides in length and overlaps with another 3- or
4-nucleotide subsite.
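
As a simple illustration of the relationship between a target site and its subsites in the absence of cross-strand interactions, the following Python sketch divides a target site into consecutive three-nucleotide subsites. It is offered only as an example under stated assumptions: the function name split_into_subsites and the 9-nucleotide sequence are hypothetical, and the overlapping four-nucleotide subsites that arise from cross-strand interactions are deliberately not modeled.

```python
def split_into_subsites(target_site, subsite_length=3):
    """Split a target site into consecutive, non-overlapping subsites.

    Assumes no cross-strand interactions, so each finger contacts exactly
    `subsite_length` nucleotides and subsites do not overlap.
    """
    site = target_site.upper()
    if len(site) % subsite_length != 0:
        raise ValueError("target site length is not a multiple of the subsite length")
    return [site[i:i + subsite_length] for i in range(0, len(site), subsite_length)]


# Arbitrary 9-nucleotide sequence of the length expected for a three-finger ZFP.
example_site = "GCGTGGGCG"
print(split_into_subsites(example_site))
```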

"Kd" refers to the dissociation constant for the compound, i.e., the 
concentration of a compound (e.g., a zinc finger protein) that gives half maximal
binding of the compound to its target (i.e., half of the compound molecules are bound
to the target) under given conditions (i.e., when [target] « Kd), as measured using a
given assay system (see, e.g., U.S. Patent No. 5,789,538). Any assay system can be
used, as long as it gives an accurate measurement of the actual Kd. In one
embodiment, the Kd for a ZFP is measured using an electrophoretic mobility shift
assay ("EMSA"), as described, for example, in WO 00/441566 and WO 00/42219.
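
The notion of half-maximal binding can be illustrated with a simple single-site binding model. The following Python sketch is an assumption-laden example and not a description of any particular assay: it assumes a single class of independent binding sites and a target concentration far below Kd, so that the free concentration of the compound is approximately its total concentration. The function name fraction_bound is hypothetical.

```python
def fraction_bound(compound_conc, kd):
    """Fraction of maximal binding for a simple single-site model.

    Assumes [target] << Kd, so free compound is approximately total compound.
    Both arguments must be in the same concentration units (e.g., molar).
    """
    if compound_conc < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return compound_conc / (compound_conc + kd)


# At a compound concentration equal to Kd, binding is half maximal.
print(fraction_bound(1e-9, 1e-9))
```

Under this model, a tenfold excess of compound over Kd gives roughly 91% of maximal binding, and a tenfold lower concentration gives roughly 9%.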

"Administering" an expression vector, nucleic acid, ZFP, or a delivery vehicle
to a cell comprises transducing, transfecting, electroporating, translocating, fusing,
phagocytosing, shooting or ballistic methods, etc., i.e., any means by which a protein
or nucleic acid can be transported across a cell membrane and preferably into the
nucleus of a cell.

The term "effective amount" includes that amount which results in the desired
result, for example, repression of an active gene, activation of a repressed gene, or
inhibition of transcription of a structural gene or translation of RNA.

A "delivery vehicle" refers to a compound, e.g., a liposome, toxin, or a
membrane translocation polypeptide, which is used to administer an exogenous
molecule. Delivery vehicles can be used, for example, to administer nucleic acids
encoding fusion molecules such as, for example, ZFP-localization domain fusions.
Exemplary delivery vehicles include lipid:nucleic acid complexes, expression vectors,
viruses, and the like.

The term "modulate" refers to a change in the quantity, degree or extent of a
function. For example, the modified zinc finger-nucleotide binding polypeptides disclosed
herein may modulate the activity of a promoter sequence by binding to a motif within the
promoter, thereby inducing, enhancing or suppressing transcription of a gene operatively
linked to the promoter sequence. Alternatively, modulation may include inhibition of
transcription of a gene wherein the modified zinc finger-nucleotide binding polypeptide
binds to the structural gene and blocks DNA dependent RNA polymerase from reading
through the gene, thus inhibiting transcription of the gene. The structural gene may be a
normal cellular gene or an oncogene, for example. Alternatively, modulation may include
inhibition of translation of a transcript. Thus, "modulation" of gene expression includes
both gene activation and gene repression.

Modulation can be assayed by determining any parameter that is indirectly or
directly affected by the expression of the target gene. Such parameters include, e.g.,
changes in RNA or protein levels; changes in protein activity; changes in product levels;
changes in downstream gene expression; changes in transcription or activity of reporter
genes such as, for example, luciferase, CAT, beta-galactosidase, or GFP (see, e.g., Mistili
& Spector, (1991) Nature Biotechnology 15:961-964); changes in signal transduction;
changes in phosphorylation and dephosphorylation; changes in receptor-ligand
interactions; changes in concentrations of second messengers such as, for example, cGMP,
cAMP, IP3, and Ca2+; changes in cell growth; changes in neovascularization; and/or
changes in any functional effect of gene expression. Measurements can be made in vitro,
in vivo, and/or ex vivo. Such functional effects can be measured by conventional methods,
e.g., measurement of RNA or protein levels, measurement of RNA stability, and/or
identification of downstream or reporter gene expression. Readout can be by way of, for
example, chemiluminescence, fluorescence, colorimetric reactions, antibody binding,
inducible markers, ligand binding assays; changes in intracellular second messengers such
as cGMP and inositol triphosphate (IP3); changes in intracellular calcium levels; cytokine
release, and the like.

Accordingly, the terms "modulating expression," "inhibiting expression" and
"activating expression" of a gene can refer to the ability of a molecule to activate or inhibit
transcription of a gene. Activation includes prevention of transcriptional inhibition (i.e.,
prevention of repression of gene expression) and inhibition includes prevention of
transcriptional activation (i.e., prevention of gene activation).

To determine the level of gene expression modulation by a ZFP, cells
contacted with ZFPs are compared to control cells, e.g., without the zinc finger
protein or with a non-specific ZFP, to examine the extent of inhibition or activation.
Control samples are assigned a relative gene expression activity value of 100%.
Modulation/inhibition of gene expression is achieved when the gene expression
activity value relative to the control is about 80%, preferably 50% (i.e., 0.5x the
activity of the control), more preferably 25%, more preferably 5-0%.
Modulation/activation of gene expression is achieved when the gene expression
activity value relative to the control is 110%, more preferably 150% (i.e., 1.5x the
activity of the control), more preferably 200-500%, more preferably 1000-2000% or
more.
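
The comparison to control cells described above amounts to a simple normalization. The following Python sketch shows one way the calculation might be carried out; it is illustrative only. The function names and the measurement values are hypothetical, and treating the 80% and 110% figures as hard cutoffs is an assumption of this example rather than a requirement of the disclosure.

```python
def relative_activity(sample_value, control_value):
    """Express a sample measurement as a percentage of the control (control = 100%)."""
    if control_value <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * sample_value / control_value


def classify_modulation(percent_of_control,
                        inhibition_cutoff=80.0,
                        activation_cutoff=110.0):
    """Label a relative activity value; the cutoffs are assumptions of this sketch."""
    if percent_of_control <= inhibition_cutoff:
        return "inhibition"
    if percent_of_control >= activation_cutoff:
        return "activation"
    return "no modulation"


# Hypothetical reporter readings (arbitrary units) for ZFP-treated and control cells.
treated, control = 50.0, 100.0
percent = relative_activity(treated, control)
print(percent, classify_modulation(percent))
```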

A "promoter" is defined as an array of nucleic acid control sequences that 
direct transcription. As used herein, a promoter typically includes necessary nucleic 
acid sequences near the start site of transcription, such as, in the case of certain RNA
polymerase II type promoters, a TATA element, enhancer, CCAAT box, SP-1 site,
etc. As used herein, a promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start
site of transcription. The promoters often have an element that is responsive to
transactivation by a DNA-binding moiety such as a polypeptide, e.g., a nuclear
receptor, Gal4, the lac repressor and the like.

A "constitutive" promoter is a promoter that is active under most 
environmental and developmental conditions. An "inducible" promoter is a promoter 
that is active under certain environmental or developmental conditions. 

A "regulatory domain" or "functional domain" refers to a protein or a
polypeptide sequence (or portion thereof) that has transcriptional modulation activity,
or that is capable of interacting with proteins and/or protein domains that have
transcriptional modulation activity. Such proteins include, e.g., transcription factors
and co-factors (e.g., KRAB, MAD, ERD, SID, nuclear factor kappa B subunit p65,
early growth response factor 1, and nuclear hormone receptors, VP16, VP64),
endonucleases, integrases, recombinases, methyltransferases, histone
acetyltransferases, histone deacetylases and polypeptides which are components of a
chromatin remodeling complex, and their functional fragments. Exemplary
components of chromatin remodeling complexes are disclosed in co-owned
PCT/US01/40616. A functional domain can be covalently or non-covalently linked to
a DNA-binding domain (e.g., a ZFP) to modulate transcription of a gene of interest.
Alternatively, some binding domains, such as, for example, ZFPs, can act in the
absence of a functional domain to modulate transcription. Furthermore, transcription
of a gene of interest can be modulated by a binding domain, such as a ZFP, linked to
multiple functional domains.

The term "heterologous" is a relative term, which when used with reference to
portions of a nucleic acid indicates that the nucleic acid comprises two or more
subsequences that are not found in the same relationship to each other in nature. For
instance, a nucleic acid that is recombinantly produced typically has two or more
sequences from unrelated genes synthetically arranged to make a new functional
nucleic acid, e.g., a promoter from one source and a coding region from another
source. The two nucleic acids are thus heterologous to each other in this context.
When added to a cell, the recombinant nucleic acids would also be heterologous to the
endogenous genes of the cell. Thus, in a cell, a heterologous nucleic acid would
include a recombinant nucleic acid that has integrated into the chromosome, or a
recombinant extrachromosomal nucleic acid.

Similarly, a heterologous protein indicates that the protein comprises two or
more subsequences that are not found in the same relationship to each other in nature
(e.g., a "fusion protein," where the two subsequences are encoded by a single nucleic
acid sequence). See, e.g., Ausubel, supra, for an introduction to recombinant
techniques.

By "host cell" is meant a cell that contains one or more exogenous molecules such
as, for example, expression vectors and/or heterologous nucleic acids. The host cell
typically supports the replication or expression of an expression vector. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as fungal cells (e.g., yeast),
protozoal cells, plant cells, insect cells, animal cells, avian cells, teleost cells, amphibian
cells, mammalian cells, primate cells or human cells. Exemplary mammalian cell lines
include CHO, HeLa, 293, COS-1, and the like, e.g., cultured cells (in vitro), explants and
primary cultures (in vitro and ex vivo), and cells in vivo.

The term "amino acid" refers to naturally occurring and synthetic amino
acids, as well as amino acid analogs and amino acid mimetics that function in a
manner similar to the naturally occurring amino acids. Naturally occurring amino
acids are those encoded by the genetic code, as well as those amino acids that are later
modified, e.g., hydroxyproline, carboxyglutamate, and O-phosphoserine. Amino acid
analogs refer to compounds that have the same basic chemical structure as a
naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a
carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine,
methionine sulfoxide, methionine, and methyl sulfonium. Such analogs have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the
same basic chemical structure as a naturally occurring amino acid. Amino acid
mimetics refer to chemical compounds that have a structure that is different from the
general chemical structure of an amino acid, but that function in a manner similar to
a naturally occurring amino acid.

"Conservatively modified variants" applies to both amino acid and nucleic
acid sequences. With respect to particular nucleic acid sequences, conservatively
modified variants refers to those nucleic acids which encode identical or essentially
identical amino acid sequences, or where the nucleic acid does not encode an amino
acid sequence, to essentially identical sequences. Specifically, degenerate codon
substitutions may be achieved by generating sequences in which the third position of
one or more selected (or all) codons is substituted with mixed-base and/or
deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et
al., J. Biol. Chem. 260:2605-2608 (1985); Rossolini et al., Mol. Cell. Probes 8:91-98
(1994)). Because of the degeneracy of the genetic code, a large number of
functionally identical nucleic acids encode any given protein. For instance, the
codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every
position where an alanine is specified by a codon in an amino acid herein, the codon
can be altered to any of the corresponding codons described without altering the
encoded polypeptide. Such nucleic acid variations are "silent variations," which are
one species of conservatively modified variations. Every nucleic acid sequence
herein which encodes a polypeptide also describes every possible silent variation of
the nucleic acid. One of skill will recognize that each codon in a nucleic acid (except
AUG, which is ordinarily the only codon for methionine, and TGG, which is
ordinarily the only codon for tryptophan) can be modified to yield a functionally
identical molecule. Accordingly, each silent variation of a nucleic acid which
encodes a polypeptide is implicit in each described sequence.
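
The degeneracy described above can be illustrated computationally. The following Python sketch builds the standard genetic code (RNA alphabet) and lists the synonymous codons for a given codon; it is offered as an example only. The function names and the short coding sequence are hypothetical, the standard code is assumed (organelle and other variant codes are not considered), and stop codons are marked with an asterisk.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; '*' marks stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon):
    """Return all codons (sorted) that encode the same residue as `codon`."""
    residue = CODON_TABLE[codon.upper().replace("T", "U")]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)


def count_silent_variants(rna):
    """Count the coding sequences (including the input) that encode the same polypeptide."""
    total = 1
    for i in range(0, len(rna) - len(rna) % 3, 3):
        total *= len(synonymous_codons(rna[i:i + 3]))
    return total


print(synonymous_codons("GCA"))            # the four alanine codons
print(count_silent_variants("AUGGCAUGG"))  # Met-Ala-Trp: only alanine can vary
```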

As to amino acid and nucleic acid sequences, individual substitutions,
deletions or additions that alter, add or delete a single amino acid or nucleotide or a
small percentage of amino acids or nucleotides in the sequence create a
"conservatively modified variant," where the alteration results in the substitution of an
amino acid with a chemically similar amino acid. Conservative substitution tables
providing functionally similar amino acids are well known in the art. Such
conservatively modified variants are in addition to and do not exclude polymorphic
variants and alleles. See, e.g., Creighton, Proteins (1984) for a discussion of amino
acid properties.

Localization Domains 

Transcriptionally inactive regions of chromatin (e.g., telomeres,
heterochromatin, matrix attachment regions, scaffold attachment regions,
centromeres) have been observed to occupy distinct nuclear addresses. See, for
example, Cockell et al. (1999) Curr. Opin. Genet. Devel. 9:199-205; Mahy et al.
(2000) in "Chromatin Structure and Gene Expression," Second Edition (S.C.R. Elgin
& J.L. Workman, eds.) Oxford University Press, Oxford, pp. 300-321 and references
therein. Thus, there exists, in at least some cases, a correlation between
transcriptional activity and nuclear localization. Moreover, certain nuclear proteins
have been observed to be localized to specific regions within the nucleus. For
example, the HP1 protein is localized to regions of the nucleus that are rich in
transcriptionally inactive heterochromatin. See, for example, Eissenberg et al. (2000)
Curr. Opin. Genet. Devel. 10:204-210. This heterochromatic localization of HP1 is
mediated, at least in part, by a region of the HP1 protein known as the chromodomain.
See, for example, Platero et al. (1995) EMBO J. 14:3977-3986. One property of
HP1-type chromodomains is their ability to bind to histone H3 that is methylated at
lysine 9. See, for example, Lachner et al. (2001) Nature 410:116-120. Thus,
exemplary localization domains include HP1 and the chromodomain, which is also
found in a number of other proteins in addition to HP1.

Additional examples of correlations between intranuclear localization and
transcriptional regulatory activity are provided by certain proteins involved in
generating and recognizing methylated chromosomal DNA. Methylation of cytosine
within CpG dinucleotide sequences in chromosomal DNA often leads to
transcriptional repression of genes associated with these methylated sequences. Two
types of proteins are directly involved with CpG methylation: DNA-N-methyl



25 



wo 02/26960 



PCT/USOI/42377 



binding proteins (known as MBDs because they possess a methylated DNA binding
domain), which bind to methylated DNA and mediate certain transcriptional effects of
DNA methylation. Both of these classes of proteins possess transcriptional regulatory
activities in addition to their methylation, or methylated DNA-binding, activities.
These additional activities are related to the ability of these proteins to recruit
transcriptional regulatory and chromatin remodeling proteins and/or to localize to
discrete nuclear compartments, thereby drawing bound DNA into the compartment in
which the protein is localized.

Accordingly, additional exemplary localization domains include DNMTs and
methylated DNA-binding domains (MBDs).

A. DNA-N-methyl transferases

The DNA methyltransferases Dnmt3a and Dnmt3b are responsible for
cytosine methylation of CpG dinucleotide sequences. CpG methylation is often
associated with transcriptional repression, especially in the context of CpG islands
located at or near the promoter of many mammalian genes. However, DNA
methyltransferases also possess transcriptional repression activity that is independent
of their ability to methylate DNA and which involves association with histone
deacetylases (HDACs). See, e.g., Rountree et al. (2000) Nature Genet. 25:269-277;
Robertson et al. (2000) Nature Genet. 25:338-342; Fuks et al. (2001) EMBO J.
20:2536-2544. DNA methyltransferases are also able to localize to heterochromatic
regions of the nucleus; this localizing ability resides in the N-terminal region of these
proteins. See, for example, Bachman et al. (2001) J. Biol. Chem. 276:32,282-32,287.
Thus, the transcriptional repression activity of DNMTs and related proteins is due, at
least in part, to their ability to recruit HDACs and to localize DNA sequences to
which they are bound to heterochromatic regions of the nucleus.

Accordingly, a DNMT, or functional fragment thereof, can serve as a
localization domain in the practice of the disclosed methods and the use of the
disclosed compositions. Exemplary DNMTs include, but are not limited to, DNMT1,
DNMT2, DNMT3a, and DNMT3b. See also Robertson (2001) Oncogene
20:3139-3155.

B. Methyl Binding Domains 

In vertebrates, methyl-CpG-binding domain proteins comprise two functional
domains: one which binds to methylated CpG dinucleotides and one which appears to
be involved in transcriptional silencing. It is known that components of certain
chromatin remodeling complexes bind to methylated DNA. Chromatin remodeling
complexes from human (NRD/NURD complex) and amphibian cells (Mi-2 complex)
contain a nucleosome-dependent ATPase activity called Mi-2 (also known as CHD).
Additional protein components of the amphibian Mi-2 complex include Mta1-like (a
DNA-binding protein homologous to metastasis-associated protein), RPD3 (the
amphibian homologue of histone deacetylases HDAC1 and HDAC2), RbAp48 (a
protein which interacts with histone H4), and MBD3 (a protein containing a
methylated CpG binding domain). The amphibian complex additionally contains a
serine- and proline-rich subunit, p66. Activities of the amphibian Mi-2 complex
include a nucleosome-dependent ATPase that is not stimulated by free histones or
DNA, translational movement of histone octamers relative to DNA, and deacetylation
of core histones within a nucleosome. Guschin et al. (2000) Biochemistry
39:5238-5245; Wade et al. (1999) Nature Genet. 23:62-66.

As described in the Examples below, Applicants have identified a structural
motif in invertebrates (which lack DNA methylation) that is homologous to the
vertebrate MBD and is a component of a Mi-2-like complex. The results described
herein indicate that these MBDs fulfill additional functions besides binding
methylated DNA. For example, invertebrate MBDs appear to be included in a
chromatin remodeling complex (the Mi-2 complex, see Examples) and are also able to
repress transcription when fused to the Gal4 DNA binding domain (see Examples).
Thus, the term methyl CpG binding domain or "MBD" as used herein refers to
polypeptides sharing the identified structural motif and functions (e.g., as components
of chromatin remodeling complexes; as agents of transcriptional repression,
corepressors, etc.). Accordingly, MBDs may, but need not, bind to methylated CpG
residues.

The methylated DNA-binding proteins MBD2 and MBD3 have been shown to
localize to heterochromatic regions of the nucleus. Hendrich et al. (1998) Mol. Cell.
Biol. 18:6538-6547. Additional proteins which possess the ability to localize to
heterochromatin include HP1 and DNA-N-methyl transferases (see supra).
Accordingly, in one embodiment, the compositions and methods described herein are
directed to using a localization domain to facilitate the recruitment of corepression
complexes to a particular site within chromatin, by fusion of the localization domain
to a DNA binding domain that can access such a site, thereby repressing gene activity.
In other aspects, the localization domain is used to interfere with corepression
complexes and function by constructing fusion molecules containing a localization
domain, a DNA binding domain and one or more regulatory domains that influence
gene expression (e.g., activation domains, repression domains and/or components of a
chromatin remodeling complex).
Any MBD having the requisite function and specificity is suitable. Thus, the
MBD can be from any species. In certain embodiments, the MBD is derived from
Drosophila MBD family members, for example dMBD-like and dMBD-likeΔ proteins
described in the Examples. In other embodiments, the MBD is derived from
vertebrate (e.g., mammalian) MBD proteins, for example, MBD1, MBD2, MBD3,
MBD4, MeCP1 and MeCP2. See, for example, Bird et al. (1999) Cell 99:451-454.
To give but one example, the methylated DNA-binding protein MeCP2 comprises
identifiable transcriptional repression functions and methylated DNA-binding
functions, and localizes to heterochromatin. Nan et al. (1993) Nucleic Acids Res.
21:4886-4892. Accordingly, these regions of MeCP2 can be used as localization
domains.

It will be clear from the disclosure that the term "localization domain," as used
herein, refers to a molecule capable, either actively or passively, of taking up a
particular intranuclear address, such address often constituting a nuclear compartment
having specific characteristics related to transcriptional activity. The term is to be
distinguished from the terms "nuclear localization sequence" and "nuclear
localization signal" which refer to sequences responsible for transport of a
polypeptide from the cytoplasm into the nucleus.

For the purposes of this disclosure, it is intended that the term "localization
domain" additionally encompass those proteins or polypeptides, or functional
fragments thereof, that associate or interact with a protein or protein domain capable
of being localized. For example, the KRAB transcription regulatory domain interacts
with the KAP-1 protein, which, in turn, interacts with HP1, which is localized to
heterochromatin (see supra). Matsuda et al. (2001) J. Biol. Chem.
276:14,222-14,229. Accordingly, proteins such as KAP-1 and KRAB, as well as any other
proteins capable of being localized, either intrinsically or through association with one
or more additional proteins, can serve as a localization domain.

DNA-Binding domains

In certain embodiments, the compositions and methods disclosed herein
involve fusions between a DNA-binding domain and a localization domain. In
additional embodiments, the compositions and methods disclosed herein involve
fusions between a DNA-binding domain and a domain which participates in
modulation of gene expression (e.g., a regulatory domain) such as, for example, a
transcriptional activation domain, a transcriptional repression domain or a component
of a chromatin remodeling complex. A DNA-binding domain can comprise any
molecular entity capable of sequence-specific binding to chromosomal DNA.
Binding can be mediated by electrostatic interactions, hydrophobic interactions, or
any other type of chemical interaction. Examples of moieties which can comprise
part of a DNA-binding domain include, but are not limited to, minor groove binders,
major groove binders, antibiotics, intercalating agents, peptides, polypeptides,
oligonucleotides, and nucleic acids. An example of a DNA-binding nucleic acid is a
triplex-forming oligonucleotide.

Minor groove binders include substances which, by virtue of their steric and/or
electrostatic properties, interact preferentially with the minor groove of
double-stranded nucleic acids. Certain minor groove binders exhibit a preference for
particular sequence compositions. For instance, netropsin, distamycin and CC-1065
are examples of minor groove binders which bind specifically to AT-rich sequences,
particularly runs of A or T. WO 96/32496.
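
As an illustration of how candidate sites for such AT-preferring binders might be located in a sequence, the following minimal sketch scans for uninterrupted stretches of A and T bases. The function name `at_runs` and the minimum run length of four bases are assumptions made for illustration only; the disclosure does not specify a threshold, and this sketch treats mixed A/T stretches and homopolymeric runs alike.

```python
import re


def at_runs(seq, min_len=4):
    """Return (start, run) for each stretch of consecutive A/T bases.

    min_len is an arbitrary illustrative threshold, not a value taken
    from the disclosure.
    """
    pattern = re.compile(r"[AT]{%d,}" % min_len)
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


# Hypothetical example sequence containing two A/T stretches of six bases.
hits = at_runs("GCGCAATTTAGCGCTTTTTTGCAT")
```

Positions are zero-based offsets into the input string; the trailing "AT" in the example falls below the assumed threshold and is not reported.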

Many antibiotics are known to exert their effects by binding to DNA. Binding 
of antibiotics to DNA is often sequence-specific or exhibits sequence preferences. 
Actinomycin, for instance, is a relatively GC-specific DNA binding agent. 

In a preferred embodiment, a DNA-binding domain is a polypeptide. Certain
peptide and polypeptide sequences bind to double-stranded DNA in a
sequence-specific manner. For example, certain transcription factors participate in transcription
initiation by RNA Polymerase II through sequence-specific interactions with DNA in
the promoter and/or enhancer regions of genes. Defined regions within the
polypeptide sequence of various transcription factors have been shown to be
responsible for sequence-specific binding to DNA. See, for example, Pabo et al.
(1992) Ann. Rev. Biochem. 61:1053-1095 and references cited therein. These regions
include, but are not limited to, motifs known as leucine zippers, helix-loop-helix
(HLH) domains, helix-turn-helix domains, zinc fingers, β-sheet motifs, steroid
receptor motifs, bZIP domains, homeodomains, AT-hooks and others. The amino acid
sequences of these motifs are known and, in some cases, amino acids that are critical
for sequence specificity have been identified. Polypeptides involved in other processes
involving DNA, such as replication, recombination and repair, will also have regions
involved in specific interactions with DNA. Peptide sequences involved in specific
DNA recognition, such as those found in proteins involved in transcription,
replication, recombination and repair, can be obtained through recombinant DNA
cloning and expression techniques or by chemical synthesis, and can be attached to
other components of a fusion molecule by methods known in the art.

In a more preferred embodiment, a DNA-binding domain comprises a zinc
finger DNA-binding domain (ZFP). See, for example, Miller et al. (1985) EMBO J.
4:1609-1614; Rhodes (1993) Scientific American Feb.:56-65; and Klug (1999)
J. Mol. Biol. 293:215-218. In one embodiment, a target site for a zinc finger
DNA-binding domain is identified according to site selection rules disclosed in co-owned
WO 00/42219. ZFP DNA-binding domains are designed and/or selected to recognize
a particular target site as described in co-owned WO 00/42219 and WO 00/41566; as
well as U.S. Patents 5,789,538; 6,007,408; and 6,013,453; and PCT publications
WO 95/19431, WO 98/53057, WO 98/53058, WO 98/53059, WO 98/53060,
WO 98/54311, WO 00/23464 and WO 00/27878.

Certain DNA-binding domains are capable of binding to DNA that is
packaged in nucleosomes. See, for example, Cordingley et al. (1987) Cell
48:261-270; Pina et al. (1990) Cell 60:719-731; and Cirillo et al. (1998) EMBO J.
17:244-254. Certain ZFP-containing proteins such as, for example, members of the nuclear
hormone receptor superfamily, are capable of binding DNA sequences packaged into
chromatin. These include, but are not limited to, the glucocorticoid receptor and the
thyroid hormone receptor. Archer et al. (1992) Science 255:1573-1576; Wong et al.
(1997) EMBO J. 16:7130-7145. Other DNA-binding domains, including certain
ZFP-containing binding domains, require more accessible DNA for binding. In the latter
case, the binding specificity of the DNA-binding domain can be determined by
identifying accessible regions in the cellular chromatin. Accessible regions can be
determined as described, for example, in co-owned PCT/US01/13631 and
PCT/US01/40617. A DNA-binding domain is then designed and/or selected to bind
to a target site within the accessible region.

Fusion Molecules

The discovery that localization domains are involved in transcriptional
corepression complexes in different vertebrate and invertebrate species also allows for
the design of fusion molecules which facilitate regulation of gene expression. Thus,
in certain embodiments, the compositions and methods disclosed herein involve
fusions between a DNA-binding domain and a localization domain (such as, for
example, a MBD) or functional fragment, as described supra, or a polynucleotide
encoding such a fusion. In this way, a localization domain is brought into proximity
with a sequence in a gene that is bound by the DNA-binding domain. The
transcriptional repression function of the localization domain is then able to act on the
gene, by recruiting additional corepressors and/or by transporting the bound gene to a
repressive compartment of the nucleus.

In additional embodiments, target remodeling of chromatin, as disclosed in
co-owned PCT/US01/40606, can be used to generate one or more sites in cellular
chromatin that are accessible to the binding of a localization domain/DNA binding
domain fusion molecule.

Fusion molecules are constructed by methods of cloning and biochemical
conjugation that are well-known to those of skill in the art. Fusion molecules
comprise a DNA-binding domain and a localization domain or a functional fragment
thereof. In certain embodiments, fusion molecules comprise a DNA-binding domain,
a localization domain, and a regulatory domain (e.g., a transcriptional activation or
repression domain or a component of a chromatin remodeling complex). Fusion
molecules also optionally comprise nuclear localization signals (such as, for example,
that from the SV40 medium T-antigen) and epitope tags (such as, for example, FLAG,
myc and hemagglutinin). Fusion proteins (and nucleic acids encoding them) are
designed such that the translational reading frame is preserved among the components
of the fusion.
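
The requirement that the reading frame be preserved can be sketched as a simple check applied when coding segments are joined. The function `join_in_frame`, the component names, and the short DNA segments below are hypothetical placeholders introduced for illustration; they are not sequences from this disclosure. The sketch assumes each component is supplied as a complete in-frame coding segment without a terminal stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(seq):
    """Split a DNA string into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def join_in_frame(components):
    """Concatenate (name, dna) coding segments, rejecting frame-breaking parts.

    A segment is rejected if its length is not a multiple of three (which
    would shift the frame of everything downstream) or if it contains an
    in-frame stop codon (which would truncate the fusion).
    """
    for name, seq in components:
        if len(seq) % 3 != 0:
            raise ValueError(f"{name}: length {len(seq)} is not a multiple of 3")
        if any(codon in STOP_CODONS for codon in split_codons(seq)):
            raise ValueError(f"{name}: contains an in-frame stop codon")
    return "".join(seq for _, seq in components)


# Placeholder segments standing in for the fusion components.
fused = join_in_frame([
    ("dna_binding_domain", "ATGGCTGCA"),
    ("linker", "GGTGGA"),
    ("localization_domain", "GCCGCG"),
])
```

A segment such as a five-base linker would be rejected by this check, since it would place every downstream codon out of frame.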

Fusions between a polypeptide component of a localization domain (or a
functional fragment thereof) on the one hand, and a non-protein DNA-binding domain
(e.g., antibiotic, intercalator, minor groove binder, nucleic acid) on the other, are
constructed by methods of biochemical conjugation known to those of skill in the art.
See, for example, the Pierce Chemical Company (Rockford, IL) Catalogue. Methods
and compositions for making fusions between a minor groove binder and a
polypeptide have been described. Mapp et al. (2000) Proc. Natl. Acad. Sci. USA
97:3930-3935.

The fusion molecules disclosed herein comprise a DNA-binding domain
which binds to a target site. In certain embodiments, the target site is present in an
accessible region of cellular chromatin. Accessible regions can be determined as
described, for example, in co-owned PCT/US01/13631 and PCT/US01/40617. If the
target site is not present in an accessible region of cellular chromatin, one or more
accessible regions can be generated as described in co-owned PCT/US01/40616. In
additional embodiments, the DNA-binding domain of a fusion molecule is capable of
binding to cellular chromatin regardless of whether its target site is in an accessible
region or not. For example, such DNA-binding domains are capable of binding to
linker DNA and/or nucleosomal DNA. Examples of this type of "pioneer" DNA
binding domain are found in certain steroid receptors and in hepatocyte nuclear factor
3 (HNF3). Cordingley et al. (1987) Cell 48:261-270; Pina et al. (1990) Cell
60:719-731; and Cirillo et al. (1998) EMBO J. 17:244-254.

Methods of chromatin modification or binding using a localization domain can
be combined with methods involving binding of endogenous or exogenous
transcriptional regulators in the region of interest to achieve modulation of gene
expression. Modulation of gene expression can be in the form of repression as, for
example, when the target gene resides in a pathological infecting microorganism or in
an endogenous gene of the subject, such as an oncogene or a viral receptor, that
contributes to a disease state. Further, as described supra, repression of a specific
target gene can be achieved by using a fusion molecule comprising a localization
domain (or functional fragment thereof) and a DNA-binding domain, for
compartmentalizing the target DNA (and related gene) into a transcriptionally
repressed nuclear location.

Alternatively, modulation can be in the form of activation, for example, if
activation of a gene (e.g., a tumor suppressor gene) can ameliorate a disease state. In
this case, a cell is contacted with a fusion molecule comprising a localization domain,
a DNA-binding domain and a transcriptional activation domain. The localization
domain portion of the fusion molecule localizes it to the repressive compartment of
the nucleus, where the DNA-binding domain is able to access the target gene. The
activation domain is then able to activate transcription of the silenced gene, by
removing it from the repressive nuclear compartment and/or by recruiting additional
coactivators that overcome the repressive environment of the target gene. These
embodiments are particularly suitable for the reactivation of genes whose expression
has been silenced during development, as such developmental silencing mechanisms
often depend upon methylation of the silenced gene.

A further exemplary method for reactivation of a gene located in a repressive
nuclear compartment is to utilize a fusion comprising a localization domain, a
DNA-binding domain and a component of a chromatin remodeling complex. In this case,
the localization domain localizes the fusion molecule to a repressive nuclear
compartment, in which the DNA-binding portion of the fusion molecule gains access
to the target gene. The chromatin remodeling component is able to assemble an
active chromatin remodeling complex on the target gene, resulting in modification of
the chromatin structure on the target gene into a transcriptionally active conformation.

Additional embodiments involve the use of a fusion molecule comprising a
DNA-binding domain and a localization domain, in combination with a second
molecule having transcriptional regulatory activity which binds in the region of
interest, to regulate expression of one or more target genes. In certain embodiments,
the second molecule comprises a fusion between a DNA-binding domain and either a
transcriptional activation domain or a transcriptional repression domain. Any
polypeptide sequence or domain capable of influencing gene expression, which can be
fused to a DNA-binding domain, is suitable for use. Activation and repression
domains are known to those of skill in the art and are disclosed, for example, in
co-owned WO 00/41566.

Exemplary activation domains include, but are not limited to, VP16, VP64,
p300, CBP, PCAF, SRC1, PvALF, AtHD2A and ERF-2. See, for example, Robyr et
al. (2000) Mol. Endocrinol. 14:329-347; Collingwood et al. (1999) J. Mol.
Endocrinol. 23:255-275; Leo et al. (2000) Gene 245:1-11;
Manteuffel-Cymborowska (1999) Acta Biochim. Pol. 46:77-89; McKenna et al. (1999) J. Steroid
Biochem. Mol. Biol. 69:3-12; Malik et al. (2000) Trends Biochem. Sci. 25:277-283;
and Lemon et al. (1999) Curr. Opin. Genet. Dev. 9:499-504. Additional exemplary
activation domains include, but are not limited to, OsGAI, HALF-1, C1, AP1, ARF-5,
-6, -7, and -8, CPRF1, CPRF4, MYC-RP/GP, and TRAB1. See, for example, Ogawa
et al. (2000) Gene 245:21-29; Okanami et al. (1996) Genes Cells 1:87-99; Goff et
al. (1991) Genes Dev. 5:298-309; Cho et al. (1999) Plant Mol. Biol. 40:419-429;
Ulmason et al. (1999) Proc. Natl. Acad. Sci. USA 96:5844-5849; Sprenger-Haussels
et al. (2000) Plant J. 22:1-8; Gong et al. (1999) Plant Mol. Biol. 41:33-44; and
Hobo et al. (1999) Proc. Natl. Acad. Sci. USA 96:15,348-15,353.

Exemplary repression domains include, but are not limited to, KRAB, SID,
MBD2, MBD3, members of the DNMT family (e.g., DNMT1, DNMT3A,
DNMT3B), Rb, and MeCP2. See, for example, Bird et al. (1999) Cell 99:451-454;
Tyler et al. (1999) Cell 99:443-446; Knoepfler et al. (1999) Cell 99:447-450; and
Robertson et al. (2000) Nature Genet. 25:338-342. Additional exemplary repression
domains include, but are not limited to, ROM2 and AtHD2A. See, for example,
Chern et al. (1996) Plant Cell 8:305-321; and Wu et al. (2000) Plant J. 22:19-27.

Common regulatory domains for use in a fusion molecule include, e.g.,
effector domains from transcription factors (activators, repressors, co-activators,
co-repressors), silencers, nuclear hormone receptors, oncogene transcription factors (e.g.,
myc, jun, fos, myb, max, mad, rel, ets, bcl, myb, mos family members etc.); DNA
repair enzymes and their associated factors and modifiers; DNA rearrangement
enzymes and their associated factors and modifiers; chromatin associated proteins and
their modifiers (e.g., kinases, acetylases and deacetylases); and DNA modifying
enzymes (e.g., methyltransferases, topoisomerases, helicases, ligases, kinases,
phosphatases, polymerases, endonucleases) and their associated factors and modifiers.

Transcription factor polypeptides from which one can obtain a regulatory
domain include those that are involved in regulated and basal transcription. Such
polypeptides include transcription factors, their effector domains, coactivators,
silencers, nuclear hormone receptors (see, e.g., Goodrich et al., Cell 84:825-30 (1996)
for a review of proteins and nucleic acid elements involved in transcription;
transcription factors in general are reviewed in Barnes & Adcock, Clin. Exp. Allergy
25 Suppl. 2:46-9 (1995) and Roeder, Methods Enzymol. 273:165-71 (1996)).
Databases dedicated to transcription factors are known (see, e.g., Science 269:630
(1995)). Nuclear hormone receptor transcription factors are described in, for
example, Rosen et al., J. Med. Chem. 38:4855-74 (1995). The C/EBP family of
transcription factors are reviewed in Wedel et al., Immunobiology 193:171-85 (1995).
Coactivators and co-repressors that mediate transcription regulation by nuclear
hormone receptors are reviewed in, for example, Meier, Eur. J. Endocrinol.
134(2):158-9 (1996); Kaiser et al., Trends Biochem. Sci. 21:342-5 (1996); and Utley
et al., Nature 394:498-502 (1998). GATA transcription factors, which are involved
in regulation of hematopoiesis, are described in, for example, Simon, Nat. Genet.
11:9-11 (1995); Weiss et al., Exp. Hematol. 23:99-107. TATA box binding protein
(TBP) and its associated TAF polypeptides (which include TAF30, TAF55, TAF80,
TAF110, TAF150, and TAF250) are described in Goodrich & Tjian, Curr. Opin. Cell
Biol. 6:403-9 (1994) and Hurley, Curr. Opin. Struct. Biol. 6:69-75 (1996). The STAT
family of transcription factors are reviewed in, for example, Barahmand-Pour et al.,
Curr. Top. Microbiol. Immunol. 211:121-8 (1996). Transcription factors involved in
disease are reviewed in Aso et al., J. Clin. Invest. 97:1561-9 (1996).

In one embodiment, the KRAB repression domain from the human KOX-1
protein is used as a repression domain (Thiesen et al., New Biologist 2:363-374
(1990); Margolin et al., PNAS 91:4509-4513 (1994); Pengue et al., Nucl. Acids Res.
22:2908-2914 (1994); Witzgall et al., PNAS 91:4514-4518 (1994); see also Example
3). In another embodiment, KAP-1, a KRAB co-repressor, is used with KRAB
(Friedman et al., Genes Dev. 10:2067-2078 (1996)). Other preferred transcription
factors and transcription factor domains that act as transcriptional repressors include
MAD (see, e.g., Sommer et al., J. Biol. Chem. 273:6632-6642 (1998); Gupta et al.,
Oncogene 16:1149-1159 (1998); Queva et al., Oncogene 16:967-977 (1998); Larsson
et al., Oncogene 15:737-748 (1997); Laherty et al., Cell 89:349-356 (1997); and
Cultraro et al., Mol. Cell. Biol. 17:2353-2359 (1997)); FKHR (forkhead in
rhabdomyosarcoma gene; Ginsberg et al., Cancer Res. 15:3542-3546 (1998); Epstein et
al., Mol. Cell. Biol. 18:4118-4130 (1998)); EGR-1 (early growth response gene
product-1; Yan et al., PNAS 95:8298-8303 (1998); and Liu et al., Cancer Gene Ther.
5:3-28 (1998)); the ets2 repressor factor repressor domain (ERD; Sgouras et al.,
EMBO J. 14:4781-4793 (1995)); and the MAD smSIN3 interaction domain (SID;
Ayer et al., Mol. Cell. Biol. 16:5772-5781 (1996)).

In one embodiment, the HSV VP16 activation domain is used as a
transcriptional activator (see, e.g., Hagmann et al., J. Virol. 71:5952-5962 (1997)).
Other preferred transcription factors that could supply activation domains include the
VP64 activation domain (Seipel et al., EMBO J. 11:4961-4968 (1996)); nuclear
hormone receptors (see, e.g., Torchia et al., Curr. Opin. Cell Biol. 10:373-383
(1998)); the p65 subunit of nuclear factor kappa B (Bitko & Barik, J. Virol.
72:5610-5618 (1998) and Doyle & Hunt, Neuroreport 8:2937-2942 (1997)); and EGR-1 (early
growth response gene product-1; Yan et al., PNAS 95:8298-8303 (1998); and Liu et
al., Cancer Gene Ther. 5:3-28 (1998)).

Kinases, phosphatases, and other proteins that modify polypeptides involved
in gene regulation are also useful as regulatory domains for use in fusion molecules.
Such modifiers are often involved in switching on or off transcription mediated by,
for example, hormones. Kinases involved in transcription regulation are reviewed in
Davis, Mol. Reprod. Dev. 42:459-67 (1995), Jackson et al., Adv. Second Messenger
Phosphoprotein Res. 28:279-86 (1993), and Boulikas, Crit. Rev. Eukaryot. Gene
Expr. 5:1-77 (1995), while phosphatases are reviewed in, for example, Schonthal,
Semin. Cancer Biol. 6:239-48 (1995). Nuclear tyrosine kinases are described in
Wang, Trends Biochem. Sci. 19:373-6 (1994).

As described, useful domains can also be obtained from the gene products of
oncogenes (e.g., myc, jun, fos, myb, max, mad, rel, ets, bcl, myb, mos, erb family
members) and their associated factors and modifiers. Oncogenes are described in, for
example, Cooper, Oncogenes, 2nd ed., The Jones and Bartlett Series in Biology,
Boston, MA, Jones and Bartlett Publishers, 1995. The ets transcription factors are
reviewed in Wasylyk et al., Eur. J. Biochem. 211:7-18 (1993) and Crepieux et al.,
Crit. Rev. Oncog. 5:615-38 (1994). Myc oncogenes are reviewed in, for example,
Ryan et al., Biochem. J. 314:713-21 (1996). The jun and fos transcription factors are
described in, for example, The Fos and Jun Families of Transcription Factors, Angel
& Herrlich, eds. (1994). The max oncogene is reviewed in Hurlin et al., Cold Spring
Harb. Symp. Quant. Biol. 59:109-16. The myb gene family is reviewed in Kanei-Ishii
et al., Curr. Top. Microbiol. Immunol. 211:89-98 (1996). The mos family is reviewed
in Yew et al., Curr. Opin. Genet. Dev. 3:19-25 (1993).

Regulatory domains can also be obtained from DNA replication and repair
enzymes and their associated factors and modifiers. DNA repair systems are
reviewed in, for example, Vos, Curr. Opin. Cell Biol. 4:385-95 (1992); Sancar, Ann.
Rev. Genet. 29:69-105 (1995); Lehmann, Genet. Eng. 17:1-19 (1995); and Wood,
Ann. Rev. Biochem. 65:135-67 (1996). DNA rearrangement enzymes and their
associated factors and modifiers can also be used as regulatory domains (see, e.g.,
Gangloff et al., Experientia 50:261-9 (1994); Sadowski, FASEB J. 7:760-7 (1993)).

Similarly, regulatory domains can be derived from DNA modifying enzymes
(e.g., DNA methyltransferases, topoisomerases, helicases, ligases, kinases,
phosphatases, polymerases) and their associated factors and modifiers. Helicases are
reviewed in Matson et al., Bioessays 16:13-22 (1994), and methyltransferases are
described in Cheng, Curr. Opin. Struct. Biol. 5:4-10 (1995). Chromatin associated
proteins and their modifiers (e.g., kinases, methylases, acetylases and deacetylases),
such as histone deacetylase (Wolffe, Science 272:371-2 (1996)) are also useful as
regulatory domains for use in a fusion molecule. In one embodiment, the regulatory
domain is a DNA methyl transferase that acts as a transcriptional repressor (see, e.g.,
Van den Wyngaert et al., FEBS Lett. 426:283-289 (1998); Flynn et al., J. Mol. Biol.
279:101-116 (1998); Okano et al., Nucleic Acids Res. 26:2536-2540 (1998); and
Zardo & Caiafa, J. Biol. Chem. 273:16517-16520 (1998)). In another embodiment,
endonucleases such as FokI are used as transcriptional repressors, which act via gene
cleavage (see, e.g., WO 94/18313 and WO 95/09233).

Factors that control chromatin and DNA structure, movement and localization
and their associated factors and modifiers; factors derived from microbes (e.g.,
prokaryotes, eukaryotes and virus) and factors that associate with or modify them can
also be used in the synthesis of fusion molecules. In one embodiment, recombinases
and integrases are used as regulatory domains. In one embodiment, histone
acetyltransferase is used as a transcriptional activator (see, e.g., Jin & Scotto, Mol.
Cell. Biol. 18:4377-4384 (1998); Wolffe, Science 272:371-372 (1996); Taunton et al.,
Science 272:408-411 (1996); and Hassig et al., PNAS 95:3519-3524 (1998)). In
another embodiment, histone deacetylase is used as a transcriptional repressor (see,
e.g., Jin & Scotto, Mol. Cell. Biol. 18:4377-4384 (1998); Syntichaki & Thireos, J.
Biol. Chem. 273:24414-24419 (1998); Sakaguchi et al., Genes Dev. 12:2831-2841
(1998); and Martinez et al., J. Biol. Chem. 273:23781-23785 (1998)).

Another suitable repression domain is methyl binding domain protein 2B
(MBD-2B) (see, also Hendrich et al. (1999) Mamm. Genome 10:906-912 for
description of MBD proteins). Another useful repression domain is that
associated with the v-ErbA protein (see infra). See, for example, Damm, et al.
(1989) Nature 339:593-597; Evans (1989) Int. J. Cancer Suppl. 4:26-28; Pain et
al. (1990) New Biol. 2:284-294; Sap et al. (1989) Nature 340:242-244; Zenke et
al. (1988) Cell 52:107-119; and Zenke et al. (1990) Cell 61:1035-1049.
Additional exemplary repression domains include, but are not limited to,
thyroid hormone receptor (TR, see infra), SID, MBD1, MBD2, MBD3, MBD4, MBD-like
proteins, members of the DNMT family (e.g., DNMT1, DNMT3A, DNMT3B), Rb, MeCP1
and MeCP2. See, for example, Bird et al. (1999) Cell 99:451-454; Tyler et al.
(1999) Cell 99:443-446; Knoepfler et al. (1999) Cell 99:447-450; and Robertson
et al. (2000) Nature Genet. 25:338-342. Additional exemplary repression domains
include, but are not limited to, ROM2 and AtHD2A. See, for example, Chern et
al. (1996) Plant Cell 8:305-321; and Wu et al. (2000) Plant J. 22:19-27.

Certain members of the nuclear hormone receptor (NHR) superfamily,
including, for example, thyroid hormone receptors (TRs) and retinoic acid
receptors (RARs), are among the most potent transcriptional regulators
currently known. Zhang et al., Annu. Rev. Physiol. 62:439-466 (2000) and Sucov
et al., Mol. Neurobiol. 10(2-3):169-184 (1995). In the absence of their cognate
ligand, these proteins bind with high specificity and affinity to short
stretches of DNA (e.g., 12-17 base pairs) within regulatory loci (e.g.,
enhancers and promoters) and effect robust transcriptional repression of
adjacent genes. The potency of their regulatory action stems from the
concurrent use of two distinct functional pathways to drive gene silencing: (i)
the creation of a localized domain of repressive chromatin via the targeting of
a complex between the corepressor N-CoR and a histone deacetylase, HDAC3
(Guenther et al., Genes Dev. 14:1048-1057 (2000); Urnov et al., EMBO J.
19:4074-4090 (2000); Li et al., EMBO J. 19:4342-4350 (2000) and Underhill et
al., J. Biol. Chem. 275:40463-40470 (2000)) and (ii) a chromatin-independent
pathway (Urnov et al., supra) that may involve direct interference with the
function of the basal transcription machinery (Fondell et al., Genes Dev.
7(7B):1400-1410 (1993) and Fondell et al., Mol. Cell. Biol. 16:281-287 (1996)).

In the presence of very low (e.g., nanomolar) concentrations of their ligand,
these receptors undergo a conformational change which leads to the release of
corepressors, recruitment of a different class of auxiliary molecules (e.g.,
coactivators) and potent transcriptional activation. Collingwood et al., J.
Mol. Endocrinol. 23(3):255-275 (1999).

The portion of the receptor protein responsible for transcriptional control
(e.g., repression and activation) can be physically separated from the portion
responsible for DNA binding, and retains full functionality when tethered to
other polypeptides, for example, other DNA-binding domains. Accordingly, a
nuclear hormone receptor transcription control domain can be used as a portion
of a fusion molecule, such that the transcriptional regulatory activity of the
receptor can be targeted to a chromosomal region of interest (e.g., a gene) by
virtue of a DNA-binding domain (e.g., a ZFP binding domain).

Moreover, the structure of TR and other nuclear hormone receptors can be
altered, either naturally or through recombinant techniques, such that it loses
all capacity to respond to hormone (thus losing its ability to drive
transcriptional activation), but retains the ability to effect transcriptional
repression. This approach is exemplified by the transcriptional regulatory
properties of the oncoprotein v-ErbA. The v-ErbA protein is one of the two
proteins required for leukemic transformation of immature red blood cell
precursors in young chicks by the avian erythroblastosis virus. TR is a major
regulator of erythropoiesis (Beug et al., Biochim. Biophys. Acta
1288(3):M35-47 (1996)); in particular, in its unliganded state, it represses
genes required for cell cycle arrest and the differentiated state. Thus, the
administration of thyroid hormone to immature erythroblasts leads to their
rapid differentiation. The v-ErbA oncoprotein is an extensively mutated version
of TR; these mutations include: (i) deletion of 12 amino-terminal amino acids;
(ii) fusion to the gag oncoprotein; (iii) several point mutations in the DNA
binding domain that alter the DNA binding specificity of the protein relative
to its parent, TR, and impair its ability to heterodimerize with the retinoid X
receptor; (iv) multiple point mutations in the ligand-binding domain of the
protein that effectively eliminate the capacity to bind thyroid hormone; and
(v) a deletion of a carboxy-terminal stretch of amino acids that is essential
for transcriptional activation. Stunnenberg et al., Biochim. Biophys. Acta
1423(1):F15-33 (1999). As a consequence of these mutations, v-ErbA retains the
capacity to bind to naturally occurring TR target genes and is an effective
transcriptional repressor when bound (Urnov et al., supra; Sap et al., Nature
340:242-244 (1989); and Ciana et al., EMBO J. 17(24):7382-7394 (1999)). In
contrast to TR, however, v-ErbA is completely insensitive to thyroid hormone,
and thus maintains transcriptional repression in the face of a challenge from
any concentration of thyroids or retinoids, whether endogenous to the medium,
or added by the investigator.

This functional property of v-ErbA is retained when its repression domain is
fused to a heterologous, synthetic DNA binding domain. Accordingly, in one
aspect, v-ErbA or its functional fragments are used as a repression domain. In
additional embodiments, TR or its functional domains are used as a repression
domain in the absence of ligand and/or as an activation domain in the presence
of ligand (e.g., 3,5,3'-triiodo-L-thyronine or T3). Thus, TR can be used as a
switchable functional domain (i.e., a bifunctional domain); its activity
(activation or repression) being dependent upon the presence or absence
(respectively) of ligand.

Additional exemplary repression domains are obtained from the DAX protein
and its functional fragments. Zazopoulos et al., Nature 390:311-315 (1997). In
particular, the C-terminal portion of DAX-1, including amino acids 245-470, has
been shown to possess repression activity. Altincicek et al., J. Biol. Chem.
275:7662-7667 (2000). A further exemplary repression domain is the RBP1 protein
and its functional fragments. Lai et al., Oncogene 18:2091-2100 (1999); Lai et
al., Mol. Cell. Biol. 19:6632-6641 (1999); Lai et al., Mol. Cell. Biol.
21:2918-2932 (2001) and WO 01/04296. The full-length RBP1 polypeptide contains
1257 amino acids. Exemplary functional fragments of RBP1 are a polypeptide
comprising amino acids 1114-1257, and a polypeptide comprising amino acids
243-452.
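
Fragment boundaries such as those given above for RBP1 are conventionally read as 1-based, inclusive residue positions, which differs from the 0-based, half-open slicing used by most programming languages. The following is only an illustrative sketch of that bookkeeping: the helper name `fragment`, the fragment labels, and the poly-alanine placeholder sequence are assumptions made for the example and are not part of the disclosure (the real RBP1 sequence is not reproduced here).

```python
def fragment(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(
            f"range {start}-{end} is outside a sequence of length {len(sequence)}"
        )
    # Convert 1-based inclusive coordinates to Python's 0-based half-open slice.
    return sequence[start - 1:end]


# Placeholder only: a 1257-residue dummy string, NOT the real RBP1 sequence.
RBP1_LENGTH = 1257
dummy_rbp1 = "A" * RBP1_LENGTH

# Residue ranges taken from the text; the labels are hypothetical.
RBP1_FRAGMENTS = {"C-terminal": (1114, 1257), "internal": (243, 452)}

fragment_lengths = {
    name: len(fragment(dummy_rbp1, start, end))
    for name, (start, end) in RBP1_FRAGMENTS.items()
}
```

With these ranges the two fragments are 144 and 210 residues long, respectively; a range that runs past residue 1257 raises an error rather than silently truncating.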

Members of the TIEG family of transcription factors contain three repression
domains known as R1, R2 and R3. Repression by TIEG family proteins is achieved
at least in part through recruitment of mSIN3A histone deacetylase complexes.
Cook et al. (1999) J. Biol. Chem. 274:29,500-29,504; Zhang et al. (2001) Mol.
Cell. Biol. 21:5041-5049. Any or all of these repression domains (or their
functional fragments) can be fused alone, or in combination with additional
repression domains (or their functional fragments), to a DNA-binding domain to
generate a targeted exogenous repressor molecule.

Furthermore, the product of the human cytomegalovirus (HCMV) UL34 open
reading frame acts as a transcriptional repressor of certain HCMV genes, for
example, the US3 gene. LaPierre et al. (2001) J. Virol. 75:6062-6069.
Accordingly, the UL34 gene product, or functional fragments thereof, can be
used as a component of a fusion molecule. Nucleic acids encoding such fusions
are also useful in the methods and compositions disclosed herein.

Yet another exemplary repression domain is the CDF-1 transcription factor
and/or its functional fragments. See, for example, WO 99/27092.

The Ikaros family of proteins are involved in the regulation of lymphocyte
development, at least in part by transcriptional repression. Accordingly, an
Ikaros family member (e.g., Ikaros, Aiolos) or a functional fragment thereof,
can be used as a repression domain. See, for example, Sabbattini et al. (2001)
EMBO J. 20:2812-2822.

The yeast Ash1p protein comprises a transcriptional repression domain.
Maxon et al. (2001) Proc. Natl. Acad. Sci. USA 98:1495-1500. Accordingly, the
Ash1p protein, its functional fragments, and homologues of Ash1p, such as those
found, for example, in vertebrate, mammalian, and plant cells, can serve as a
repression domain for use in the methods and compositions disclosed herein.

Additional exemplary repression domains include those derived from histone
deacetylases (HDACs, e.g., Class I HDACs, Class II HDACs, SIR-2 homologues),
HDAC-interacting proteins (e.g., SIN3, SAP30, SAP15, NCoR, SMRT, RB, p107,
p130, RBAP46/48, MTA, Mi-2, Brg1, Brm), DNA-cytosine methyltransferases (e.g.,
Dnmt1, Dnmt3a, Dnmt3b), proteins that bind methylated DNA (e.g., MBD1, MBD2,
MBD3, MBD4, MeCP2, DMAP1), protein methyltransferases (e.g., lysine and
arginine methylases, SuVar homologues such as Suv39H1), polycomb-type
repressors (e.g., Bmi-1, eed1, RING1, RYBP, E2F6, Mel18, YY1 and CtBP), viral
repressors (e.g., adenovirus E1b 55K protein, cytomegalovirus UL34 protein,
viral oncogenes such as v-erbA), hormone receptors (e.g., Dax-1, estrogen
receptor, thyroid hormone receptor), and repression domains associated with
naturally-occurring zinc finger proteins (e.g., WT1, KAP1). Further exemplary
repression domains include members of the polycomb complex and their
homologues, HPH1, HPH2, HPC2, NC2, groucho, Eve, tramtrak, mHP1, SIP1, ZEB1,
ZEB2, and Enx1/Ezh2. In all of these cases, either the full-length protein or a
functional fragment can be used as a repression domain in a fusion molecule.
Furthermore, any homologues of the aforementioned proteins can also be used as
repression domains, as can proteins (or their functional fragments) that
interact with any of the aforementioned proteins.

Additional repression domains, and exemplary functional fragments, are as
follows. Hes1 is a human homologue of the Drosophila hairy gene product and
comprises a functional fragment encompassing amino acids 910-1014. In
particular, a WRPW (trp-arg-pro-trp) motif can act as a repression domain.
Fisher et al. (1996) Mol. Cell. Biol. 16:2670-2677.

The TLE1, TLE2 and TLE3 proteins are human homologues of the
Drosophila groucho gene product. Functional fragments of these proteins
possessing repression activity reside between amino acids 1-400. Fisher et al.,
supra.

The Tbx3 protein possesses a functional repression domain between amino
acids 524-721. He et al. (1999) Proc. Natl. Acad. Sci. USA 96:10,212-10,217.
The Tbx2 gene product is involved in repression of the p14/p16 genes and
contains a region between amino acids 504-702 that is homologous to the
repression domain of Tbx3; accordingly Tbx2 and/or this functional fragment can
be used as a repression domain. Carreira et al. (1998) Mol. Cell. Biol.
18:5,099-5,108.

The human Ezh2 protein is a homologue of Drosophila enhancer of zeste and
recruits the eed1 polycomb-type repressor. A region of the Ezh2 protein
comprising amino acids 1-193 can interact with eed1 and repress transcription;
accordingly Ezh2 and/or this functional fragment can be used as a repression
domain. Denisenko et al. (1998) Mol. Cell. Biol. 18:5634-5642.

The RYBP protein is a corepressor that interacts with polycomb complex
members and with the YY1 transcription factor. A region of RYBP comprising
amino acids 42-208 has been identified as a functional repression domain.
Garcia et al. (1999) EMBO J. 18:3404-3418.

The RING finger protein RING1A is a member of two different vertebrate
polycomb-type complexes, contains multiple binding sites for various components
of the polycomb complex, and possesses transcriptional repression activity.
Accordingly, RING1A or its functional fragments can serve as a repression
domain. Satijn et al. (1997) Mol. Cell. Biol. 17:4105-4113.

The Bmi-1 protein is a member of a vertebrate polycomb complex and is
involved in transcriptional silencing. It contains multiple binding sites for
various polycomb complex components. Accordingly, Bmi-1 and its functional
fragments are useful as repression domains. Gunster et al. (1997) Mol. Cell.
Biol. 17:2326-2335; Hemenway et al. (1998) Oncogene 16:2541-2547.

The E2F6 protein is a member of the mammalian Bmi-1-containing polycomb
complex and is a transcriptional repressor that is capable of recruiting RYBP,
Bmi-1 and RING1A. A functional fragment of E2F6 comprising amino acids 129-281
acts as a transcriptional repression domain. Accordingly, E2F6 and its
functional fragments can be used as repression domains. Trimarchi et al. (2001)
Proc. Natl. Acad. Sci. USA 98:1519-1524.

The eed1 protein represses transcription at least in part through recruitment
of histone deacetylases (e.g., HDAC2). Repression activity resides in both the
N- and C-terminal regions of the protein. Accordingly, eed1 and its functional
fragments can be used as repression domains. van der Vlag et al. (1999) Nature
Genet. 23:474-478.

The CTBP2 protein represses transcription at least in part through recruitment
of an HBCl-polycomb complex. Accordingly, CTBP2 and its functional fragments
are useful as repression domains. Richard et al. (1999) Mol. Cell. Biol.
19:777-787.

Neuron-restrictive silencer factors are proteins that repress expression of
neuron-specific genes. Accordingly, a NRSF or functional fragment thereof can
serve as a repression domain. See, for example, US Patent No. 6,270,990.

It will be clear to those of skill in the art that, in the formation of a
fusion protein (or a nucleic acid encoding same) between a DNA-binding domain
and a regulatory domain, either a repressor or a molecule that interacts with a
repressor is suitable as a repression domain. Essentially any molecule capable
of recruiting a repressive complex and/or repressive activity (such as, for
example, histone deacetylation) to the target gene is useful as a repression
domain of a fusion protein.

Additional exemplary activation domains include, but are not limited to, p300,
CBP, PCAF, SRC1, PvALF, AtHD2A and ERF-2. See, for example, Robyr et al.
(2000) Mol. Endocrinol. 14:329-347; Collingwood et al. (1999) J. Mol.
Endocrinol. 23:255-275; Leo et al. (2000) Gene 245:1-11; Manteuffel-Cymborowska
(1999) Acta Biochim. Pol. 46:77-89; McKenna et al. (1999) J. Steroid Biochem.
Mol. Biol. 69:3-12; Malik et al. (2000) Trends Biochem. Sci. 25:277-283; and
Lemon et al. (1999) Curr. Opin. Genet. Dev. 9:499-504. Additional exemplary
activation domains include, but are not limited to, OsGAI, HALF-1, C1, AP1,
ARF-5, -6, -7, and -8, CPRF1, CPRF4, MYC-RP/GP, and TRAB1. See, for example,
Ogawa et al. (2000) Gene 245:21-29; Okanami et al. (1996) Genes Cells 1:87-99;
Goff et al. (1991) Genes Dev. 5:298-309; Cho et al. (1999) Plant Mol. Biol.
40:419-429; Ulmasov et al. (1999) Proc. Natl. Acad. Sci. USA 96:5844-5849;
Sprenger-Haussels et al. (2000) Plant J. 22:1-8; Gong et al. (1999) Plant Mol.
Biol. 41:33-44; and Hobo et al. (1999) Proc. Natl. Acad. Sci. USA
96:15,348-15,353.

It will be clear to those of skill in the art that, in the formation of a
fusion protein (or a nucleic acid encoding same), either an activator or a
molecule that interacts with an activator is suitable as a regulatory domain.
Essentially any molecule capable of recruiting an activating complex and/or
activating activity (such as, for example, histone acetylation) to the target
gene is useful as an activating domain of a fusion molecule.

Chromatin remodeling proteins and components of chromatin remodeling
complexes for use as regulatory domains in fusion molecules are described, for
example, in co-owned PCT application US01/40616.

In a further embodiment, a DNA-binding domain (e.g., a zinc finger domain)
is fused to a bifunctional domain (BFD). A bifunctional domain is a
transcriptional regulatory domain whose activity depends upon interaction of
the BFD with a second molecule. The second molecule can be any type of molecule
capable of influencing the functional properties of the BFD including, but not
limited to, a compound, a small molecule, a peptide, a protein, a
polysaccharide or a nucleic acid. An exemplary BFD is the ligand binding domain
of the estrogen receptor (ER). In the presence of estradiol, the ER ligand
binding domain acts as a transcriptional activator; while, in the absence of
estradiol and the presence of tamoxifen or 4-hydroxy-tamoxifen, it acts as a
transcriptional repressor. Another example of a BFD is the thyroid hormone
receptor (TR) ligand binding domain which, in the absence of ligand, acts as a
transcriptional repressor and, in the presence of thyroid hormone (T3), acts as
a transcriptional activator. An additional BFD is the glucocorticoid receptor
(GR) ligand binding domain. In the presence of dexamethasone, this domain acts
as a transcriptional activator; while, in the presence of RU486, it acts as a
transcriptional repressor. An additional exemplary BFD is the ligand binding
domain of the retinoic acid receptor. In the presence of its ligand
all-trans-retinoic acid, the retinoic acid receptor recruits a number of
co-activator complexes and activates transcription. In the absence of ligand,
the retinoic acid receptor is not capable of recruiting transcriptional
co-activators. Additional BFDs are known to those of skill in the art. See, for
example, US Patent Nos. 5,834,266 and 5,994,313 and PCT WO 99/10508.
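
The ligand-dependent behavior of the exemplary BFDs described above can be summarized as a simple lookup. The sketch below is illustrative only: the table name `BFD_ACTIVITY`, the function `bfd_activity`, the short domain labels, and the `"unspecified"` fallback are assumptions introduced for the example, and the entries merely transcribe the combinations stated in the preceding paragraph.

```python
from typing import Optional

# (domain, ligand) -> activity, transcribed from the preceding paragraph.
# A ligand of None means "no ligand present".
BFD_ACTIVITY = {
    ("ER", "estradiol"): "activator",
    ("ER", "tamoxifen"): "repressor",
    ("ER", "4-hydroxy-tamoxifen"): "repressor",
    ("TR", None): "repressor",
    ("TR", "T3"): "activator",
    ("GR", "dexamethasone"): "activator",
    ("GR", "RU486"): "repressor",
    ("RAR", "all-trans-retinoic acid"): "activator",
}


def bfd_activity(domain: str, ligand: Optional[str] = None) -> str:
    """Return the activity stated in the text, or 'unspecified' if none is stated."""
    return BFD_ACTIVITY.get((domain, ligand), "unspecified")
```

For example, `bfd_activity("TR")` gives `"repressor"` and `bfd_activity("TR", "T3")` gives `"activator"`. The unliganded retinoic acid receptor returns `"unspecified"`, because the paragraph states only that it cannot recruit co-activators.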

In additional embodiments, a plurality of fusion molecules can be used in the
disclosed methods. For example, a plurality of localization domain/DNA-binding
domain fusions can be used; and a plurality of localization domain/DNA-binding
domain/regulatory domain fusions can be used.

For these and other applications, exogenous molecules can be formulated with
a pharmaceutically acceptable carrier, as is known to those of skill in the
art. See, for example, Remington's Pharmaceutical Sciences, 17th ed., 1985; and
co-owned WO 00/42219.



Polynucleotide and Polypeptide Delivery 

The compositions described herein can be provided to the target cell in vitro
or in vivo. In addition, the compositions can be provided as polypeptides,
polynucleotides or combination thereof.

A. Delivery of Polynucleotides

In certain embodiments, the compositions are provided as one or more
polynucleotides. Further, as noted above, a localization domain-containing
composition can be designed as a fusion between a polypeptide DNA-binding
domain and a localization domain and can be encoded by a fusion nucleic acid.
In both fusion and non-fusion cases, the nucleic acid can be cloned into
intermediate vectors for transformation into prokaryotic or eukaryotic cells
for replication and/or expression. Intermediate vectors for storage or
manipulation of the nucleic acid or production of protein can be prokaryotic
vectors (e.g., plasmids), shuttle vectors, insect vectors, or viral vectors,
for example. A nucleic acid encoding a localization domain or a localization
domain fusion can also be cloned into an expression vector, for administration
to a bacterial cell, fungal cell, protozoal cell, plant cell, or animal cell,
preferably a mammalian cell, more preferably a human cell.

To obtain expression of a cloned nucleic acid, it is typically subcloned into
an expression vector that contains a promoter to direct transcription. Suitable
bacterial and eukaryotic promoters are well known in the art and described in
Sambrook et al., supra; Ausubel et al., supra; and Kriegler, Gene Transfer and
Expression: A Laboratory Manual (1990). Bacterial expression systems are
available in E. coli, Bacillus sp., and Salmonella. Palva et al. (1983) Gene
22:229-235. Kits for such expression systems are commercially available.
Eukaryotic expression systems for mammalian cells, yeast, and insect cells are
well known in the art and are also commercially available, for example, from
Invitrogen, Carlsbad, CA and Clontech, Palo Alto, CA.

The promoter used to direct expression of the nucleic acid of choice depends
on the particular application. For example, a strong constitutive promoter is
typically used for expression and purification. In contrast, when a
dedifferentiation protein is to be used in vivo, either a constitutive or an
inducible promoter is used, depending on the particular use of the protein. In
addition, a weak promoter can be used, such as HSV TK or a promoter having
similar activity. The promoter typically can also include elements that are
responsive to transactivation, e.g., hypoxia response elements, Gal4 response
elements, lac repressor response element, and small molecule control systems
such as tet-regulated systems and the RU-486 system. See, e.g., Gossen et al.
(1992) Proc. Natl. Acad. Sci. USA 89:5547-5551; Oligino et al. (1998) Gene
Ther. 5:491-496; Wang et al. (1997) Gene Ther. 4:432-441; Neering et al. (1996)
Blood 88:1147-1155; and Rendahl et al. (1998) Nat. Biotechnol. 16:757-761.

In addition to a promoter, an expression vector typically contains a
transcription unit or expression cassette that contains additional elements
required for the expression of the nucleic acid in host cells, either
prokaryotic or eukaryotic. A typical expression cassette thus contains a
promoter operably linked, e.g., to the nucleic acid sequence, and signals
required, e.g., for efficient polyadenylation of the transcript,
transcriptional termination, ribosome binding, and/or translation termination.
Additional elements of the cassette may include, e.g., enhancers, and
heterologous spliced intronic signals.

The particular expression vector used to transport the genetic information
into the cell is selected with regard to the intended use of the encoded
polypeptide, e.g., expression in plants, animals, bacteria, fungi, protozoa,
etc. Standard bacterial expression vectors include plasmids such as pBR322,
pBR322-based plasmids, pSKF, pET23D, and commercially available fusion
expression systems such as GST and LacZ. Epitope tags can also be added to
recombinant proteins to provide convenient methods of isolation, for monitoring
expression, and for monitoring cellular and subcellular localization, e.g.,
hemagglutinin (HA), c-myc or FLAG.

Expression vectors containing regulatory elements from eukaryotic viruses are
often used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma
virus vectors, and vectors derived from Epstein-Barr virus. Other exemplary
eukaryotic vectors include pMSG, pAV009/A+, pMTO10/A+, pMAMneo-5, baculovirus
pDSVE, and any other vector allowing expression of proteins under the direction
of the SV40 early promoter, SV40 late promoter, CMV promoter, metallothionein
promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter,
polyhedrin promoter, or other promoters shown effective for expression in
eukaryotic cells.

Some expression systems have markers for selection of stably transfected cell
lines such as thymidine kinase, hygromycin B phosphotransferase, and
dihydrofolate reductase. High-yield expression systems are also suitable, such
as baculovirus vectors in insect cells, with an inserted nucleic acid sequence
under the transcriptional control of the polyhedrin promoter or any other
strong baculovirus promoter.

Elements that are typically included in expression vectors also include a
replicon that functions in E. coli (or in the prokaryotic host, if other than
E. coli), a selective marker, e.g., a gene encoding antibiotic resistance, to
permit selection of bacteria that harbor recombinant plasmids, and unique
restriction sites in nonessential regions of the vector to allow insertion of
recombinant sequences.

Standard transfection methods can be used to produce bacterial, mammalian,
yeast, insect, or other cell lines that express large quantities of
heterologous proteins, which can be purified, if desired, using standard
techniques. See, e.g., Colley et al. (1989) J. Biol. Chem. 264:17619-17622; and
Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher,
ed.) 1990. Transformation of eukaryotic and prokaryotic cells is performed
according to standard techniques. See, e.g., Morrison (1977) J. Bacteriol.
132:349-351; Clark-Curtiss et al. (1983) in Methods in Enzymology 101:347-362
(Wu et al., eds).

Any procedure for introducing foreign nucleotide sequences into host cells
can be used. These include, but are not limited to, the use of calcium
phosphate transfection, DEAE-dextran-mediated transfection, polybrene,
protoplast fusion, electroporation, lipid-mediated delivery (e.g., liposomes),
microinjection, particle bombardment, introduction of naked DNA, plasmid
vectors, viral vectors (both episomal and integrative) and any of the other
well known methods for introducing cloned genomic DNA, cDNA, synthetic DNA or
other foreign genetic material into a host cell (see, e.g., Sambrook et al.,
supra). It is only necessary that the particular genetic engineering procedure
used be capable of successfully introducing at least one gene into the host
cell capable of expressing the protein of choice.

Conventional viral and non-viral based gene transfer methods can be used to
introduce nucleic acids into mammalian cells or target tissues. Such methods
can be used to administer nucleic acids encoding fusion polypeptides to cells
in vitro. Preferably, nucleic acids are administered for in vivo or ex vivo
gene therapy uses. Non-viral vector delivery systems include DNA plasmids,
naked nucleic acid, and nucleic acid complexed with a delivery vehicle such as
a liposome. Viral vector delivery systems include DNA and RNA viruses, which
have either episomal or integrated genomes after delivery to the cell. For
reviews of gene therapy procedures, see, for example, Anderson (1992) Science
256:808-813; Nabel et al. (1993) Trends Biotechnol. 11:211-217; Mitani et al.
(1993) Trends Biotechnol. 11:162-166; Dillon (1993) Trends Biotechnol.
11:167-175; Miller (1992) Nature 357:455-460; Van Brunt (1988) Biotechnology
6(10):1149-1154; Vigne (1995) Restorative Neurology and Neuroscience 8:35-36;
Kremer et al. (1995) British Medical Bulletin 51(1):31-44; Haddada et al., in
Current Topics in Microbiology and Immunology, Doerfler and Bohm (eds), 1995;
and Yu et al. (1994) Gene Therapy 1:13-26.

Methods of non-viral delivery of nucleic acids include lipofection,
microinjection, ballistics, virosomes, liposomes, immunoliposomes, polycation
or lipid:nucleic acid conjugates, naked DNA, artificial virions, and
agent-enhanced uptake of DNA. Lipofection is described in, e.g., U.S. Patent
Nos. 5,049,386; 4,946,787; and 4,897,355 and lipofection reagents are sold
commercially (e.g., Transfectam™ and Lipofectin™). Cationic and neutral lipids
that are suitable for efficient lipofection of polynucleotides include those of
Felgner, WO 91/17424 and WO 91/16024. Nucleic acid can be delivered to cells
(in vitro or ex vivo administration) or to target tissues (in vivo
administration).

The preparation of lipid:nucleic acid complexes, including targeted liposomes
such as immunolipid complexes, is well known to those of skill in the art. See,
e.g., Crystal (1995) Science 270:404-410; Blaese et al. (1995) Cancer Gene
Ther. 2:291-297; Behr et al. (1994) Bioconjugate Chem. 5:382-389; Remy et al.
(1994) Bioconjugate Chem. 5:647-654; Gao et al. (1995) Gene Therapy 2:710-722;
Ahmad et al. (1992) Cancer Res. 52:4817-4820; and U.S. Patent Nos. 4,186,183;
4,217,344; 4,235,871; 4,261,975; 4,485,054; 4,501,728; 4,774,085; 4,837,028 and
4,946,787.

The use of RNA or DNA virus-based systems for the delivery of nucleic acids
takes advantage of highly evolved processes for targeting a virus to specific
cells in the body and trafficking the viral payload to the nucleus. Viral
vectors can be administered directly to patients (in vivo) or they can be used
to treat cells in vitro, wherein the modified cells are administered to
patients (ex vivo). Conventional viral based systems for the delivery of
nucleic acids include retroviral, lentiviral, poxviral, adenoviral,
adeno-associated viral, vesicular stomatitis viral and herpesviral vectors.
Integration in the host genome is possible with certain viral vectors,
including the retrovirus, lentivirus, and adeno-associated virus gene transfer
methods, often resulting in long term expression of the inserted transgene.
Additionally, high transduction efficiencies have been observed in many
different cell types and target tissues.

The tropism of a retrovirus can be altered by incorporating foreign envelope
proteins, allowing alteration and/or expansion of the potential target cell
population. Lentiviral vectors are retroviral vectors that are able to
transduce or infect non-dividing cells and typically produce high viral titers.
Selection of a retroviral gene transfer system would therefore depend on the
target tissue. Retroviral vectors have a packaging capacity of up to 6-10 kb of
foreign sequence and are comprised of cis-acting long terminal repeats (LTRs).
The minimum cis-acting LTRs are sufficient for replication and packaging of the
vectors, which are then used to integrate the therapeutic gene into the target
cell to provide permanent transgene expression. Widely used retroviral vectors
include those based upon murine leukemia virus (MuLV), gibbon ape leukemia
virus (GaLV), simian immunodeficiency virus (SIV), human immunodeficiency virus
(HIV), and combinations thereof. Buchscher et al. (1992) J. Virol.
66:2731-2739; Johann et al. (1992) J. Virol. 66:1635-1640; Sommerfelt et al.
(1990) Virol. 176:58-59; Wilson et al. (1989) J. Virol. 63:2374-2378; Miller et
al. (1991) J. Virol. 65:2220-2224; and PCT/US94/05700.

Adeno-associated virus (AAV) vectors are also used to transduce cells with
target nucleic acids, e.g., in the in vitro production of nucleic acids and peptides, and
for in vivo and ex vivo gene therapy procedures. See, e.g., West et al. (1987) Virology
160:38-47; U.S. Patent No. 4,797,368; WO 93/24641; Kotin (1994) Hum. Gene
Ther. 5:793-801; and Muzyczka (1994) J. Clin. Invest. 94:1351. Construction of
recombinant AAV vectors is described in a number of publications, including U.S.
Patent No. 5,173,414; Tratschin et al. (1985) Mol. Cell. Biol. 5:3251-3260;
Tratschin et al. (1984) Mol. Cell. Biol. 4:2072-2081; Hermonat et al. (1984) Proc.
Natl. Acad. Sci. USA 81:6466-6470; and Samulski et al. (1989) J. Virol. 63:3822-3828.

Recombinant adeno-associated virus vectors based on the defective and
nonpathogenic parvovirus adeno-associated virus type 2 (AAV-2) are a promising
gene delivery system. Exemplary AAV vectors are derived from a plasmid
containing the AAV 145 bp inverted terminal repeats flanking a transgene expression
cassette. Efficient gene transfer and stable transgene delivery due to integration into
the genomes of the transduced cell are key features for this vector system. Wagner et
al. (1998) Lancet 351(9117):1702-3; and Kearns et al. (1996) Gene Ther. 9:748-55.

pLASN and MFG-S are examples of retroviral vectors that have been used in
clinical trials. Dunbar et al. (1995) Blood 85:3048-305; Kohn et al. (1995) Nature
Med. 1:1017-102; Malech et al. (1997) Proc. Natl. Acad. Sci. USA 94:12133-12138.
PA317/pLASN was the first therapeutic vector used in a gene therapy trial (Blaese et
al. (1995) Science 270:475-480). Transduction efficiencies of 50% or greater have
been observed for MFG-S packaged vectors. Ellem et al. (1997) Immunol.
Immunother. 44(1):10-20; Dranoff et al. (1997) Hum. Gene Ther. 1:111-2.

In applications for which transient expression is preferred, adenoviral-based
systems are useful. Adenoviral based vectors are capable of very high transduction
efficiency in many cell types and are capable of infecting, and hence delivering
nucleic acid to, both dividing and non-dividing cells. With such vectors, high titers
and levels of expression have been obtained. Adenovirus vectors can be produced in
large quantities in a relatively simple system.
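
The vector properties described above can be summarized as a small lookup. The following Python sketch is illustrative only: the class, field and function names are hypothetical, and it encodes only the properties stated in the preceding paragraphs (integration with long term expression for retroviral, lentiviral and AAV vectors; transient expression for adenoviral vectors; transduction of non-dividing cells for lentiviral and adenoviral vectors). A value of None marks a property the text does not state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VectorProfile:
    name: str
    integrates: bool                         # integration into the host genome
    transduces_nondividing: Optional[bool]   # None = not stated in the text


# Properties taken only from the description above (illustrative summary).
PROFILES = [
    VectorProfile("retroviral", True, None),
    VectorProfile("lentiviral", True, True),
    VectorProfile("adeno-associated viral", True, None),
    VectorProfile("adenoviral", False, True),
]


def candidates(long_term: bool, nondividing_target: bool) -> List[str]:
    """Return vector classes consistent with the requested expression profile."""
    hits = []
    for profile in PROFILES:
        if profile.integrates != long_term:
            continue
        # Require an explicit statement when the target cells are non-dividing.
        if nondividing_target and profile.transduces_nondividing is not True:
            continue
        hits.append(profile.name)
    return hits


if __name__ == "__main__":
    print(candidates(long_term=True, nondividing_target=True))
    print(candidates(long_term=False, nondividing_target=True))
```

Such a table is only a starting point; as noted above, the choice of a gene transfer system also depends on the target tissue.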

Replication-deficient recombinant adenoviral (Ad) vectors can be produced at high
titer and they readily infect a number of different cell types. Most adenovirus vectors
are engineered such that a transgene replaces the Ad E1a, E1b, and/or E3 genes; the
replication-defective vector is propagated in human 293 cells that supply the required
E1 functions in trans. Ad vectors can transduce multiple types of tissues in vivo,
including non-dividing, differentiated cells such as those found in the liver, kidney
and muscle. Conventional Ad vectors have a large carrying capacity for inserted
DNA. An example of the use of an Ad vector in a clinical trial involved
polynucleotide therapy for antitumor immunization with intramuscular injection.
Sterman et al. (1998) Hum. Gene Ther. 7:1083-1089. Additional examples of the use
of adenovirus vectors for gene transfer in clinical trials include Rosenecker et al.
(1996) Infection 24:5-10; Sterman et al., supra; Welsh et al. (1995) Hum. Gene
Ther. 2:205-218; Alvarez et al. (1997) Hum. Gene Ther. 5:597-613; and Topf et al.
(1998) Gene Ther. 5:507-513.

Packaging cells are used to form virus particles that are capable of infecting a
host cell. Such cells include 293 cells, which package adenovirus, and ψ2 cells or
PA317 cells, which package retroviruses. Viral vectors used in gene therapy are
usually generated by a producer cell line that packages a nucleic acid vector into a
viral particle. The vectors typically contain the minimal viral sequences required for
packaging and subsequent integration into a host, other viral sequences being replaced
by an expression cassette for the protein to be expressed. Missing viral functions are
supplied in trans, if necessary, by the packaging cell line. For example, AAV vectors
used in gene therapy typically only possess ITR sequences from the AAV genome,
which are required for packaging and integration into the host genome. Viral DNA is
packaged in a cell line, which contains a helper plasmid encoding the other AAV
genes, namely rep and cap, but lacking ITR sequences. The cell line is also infected
with adenovirus as a helper. The helper virus promotes replication of the AAV vector
and expression of AAV genes from the helper plasmid. The helper plasmid is not
packaged in significant amounts due to a lack of ITR sequences. Contamination with
adenovirus can be reduced by, e.g., heat treatment, which preferentially inactivates
adenoviruses.

In many gene therapy applications, it is desirable that the gene therapy vector
be delivered with a high degree of specificity to a particular tissue type. A viral
vector can be modified to have specificity for a given cell type by expressing a ligand
as a fusion protein with a viral coat protein on the outer surface of the virus. The
ligand is chosen to have affinity for a receptor known to be present on the cell type of
interest. For example, Han et al. (1995) Proc. Natl. Acad. Sci. USA 92:9747-9751
reported that Moloney murine leukemia virus can be modified to express human
heregulin fused to gp70, and the recombinant virus infects certain human breast
cancer cells expressing human epidermal growth factor receptor. This principle can
be extended to other pairs of virus expressing a ligand fusion protein and target cell
expressing a receptor. For example, filamentous phage can be engineered to display
antibody fragments (e.g., Fab or Fv) having specific binding affinity for virtually any
chosen cellular receptor. Although the above description applies primarily to viral
vectors, the same principles can be applied to non-viral vectors. Such vectors can be
engineered to contain specific uptake sequences thought to favor uptake by specific
target cells.

Gene therapy vectors can be delivered in vivo by administration to an
individual patient, typically by systemic administration (e.g., intravenous,
intraperitoneal, intramuscular, subdermal, or intracranial infusion) or topical
application, as described infra. Alternatively, vectors can be delivered to cells ex
vivo, such as cells explanted from an individual patient (e.g., lymphocytes, bone
marrow aspirates, tissue biopsy) or universal donor hematopoietic stem cells,
followed by reimplantation of the cells into a patient, usually after selection for cells
which have incorporated the vector.

Ex vivo cell transfection for diagnostics, research, or for gene therapy (e.g., via
re-infusion of the transfected cells into the host organism) is well known to those of
skill in the art. In a preferred embodiment, cells are isolated from the subject
organism, transfected with a nucleic acid (gene or cDNA), and re-infused back into
the subject organism (e.g., patient). Various cell types suitable for ex vivo
transfection are well known to those of skill in the art. See, e.g., Freshney et al.,
Culture of Animal Cells, A Manual of Basic Technique, 3rd ed., 1994, and references
cited therein, for a discussion of isolation and culture of cells from patients.

In one embodiment, hematopoietic stem cells are used in ex vivo procedures
for cell transfection and gene therapy. The advantage to using stem cells is that they
can be differentiated into other cell types in vitro, or can be introduced into a mammal
(such as the donor of the cells) where they will engraft in the bone marrow. Methods
for differentiating CD34+ stem cells in vitro into clinically important immune cell
types using cytokines such as GM-CSF, IFN-γ and TNF-α are known. Inaba et al.
(1992) J. Exp. Med. 176:1693-1702.

Stem cells are isolated for transduction and differentiation using known
methods. For example, stem cells are isolated from bone marrow cells by panning the
bone marrow cells with antibodies which bind unwanted cells, such as CD4+ and
CD8+ (T cells), CD45+ (pan B cells), GR-1 (granulocytes), and Iad (differentiated
antigen presenting cells). See Inaba et al., supra.

Vectors (e.g., retroviruses, adenoviruses, liposomes, etc.) containing
therapeutic nucleic acids can also be administered directly to the organism for
transduction of cells in vivo. Alternatively, naked DNA can be administered.
Administration is by any of the routes normally used for introducing a molecule into
ultimate contact with blood or tissue cells. Suitable methods of administering such
nucleic acids are available and well known to those of skill in the art, and, although
more than one route can be used to administer a particular composition, a particular
route can often provide a more immediate and more effective reaction than another
route.

Pharmaceutically acceptable carriers are determined in part by the particular
composition being administered, as well as by the particular method used to
administer the composition. Accordingly, there is a wide variety of suitable
formulations of pharmaceutical compositions, as described below. See, e.g.,
Remington's Pharmaceutical Sciences, 17th ed., 1989.

B. Delivery of Polypeptides

In other embodiments, for example in certain in vitro situations, target cells
are cultured in a medium containing localization domain fusion polypeptides or
functional fragments thereof.

An important factor in the administration of polypeptide compounds is
ensuring that the polypeptide has the ability to traverse the plasma membrane of a
cell, or the membrane of an intracellular compartment such as the nucleus. Cellular
membranes are composed of lipid-protein bilayers that are freely permeable to small,
nonionic lipophilic compounds and are inherently impermeable to polar compounds,
macromolecules, and therapeutic or diagnostic agents. However, proteins, lipids and
other compounds, which have the ability to translocate polypeptides across a cell
membrane, have been described.

For example, "membrane translocation polypeptides" have amphiphilic or
hydrophobic amino acid subsequences that have the ability to act as
membrane-translocating carriers. In one embodiment, homeodomain proteins have the ability to
translocate across cell membranes. The shortest internalizable peptide of a
homeodomain protein, Antennapedia, was found to be the third helix of the protein,
from amino acid position 43 to 58. Prochiantz (1996) Curr. Opin. Neurobiol. 6:629-634.
Another subsequence, the h (hydrophobic) domain of signal peptides, was found
to have similar cell membrane translocation characteristics. Lin et al. (1995) J. Biol.
Chem. 270:14255-14258.

Examples of peptide sequences which can be linked to a polypeptide for
facilitating its uptake into cells include, but are not limited to: an 11 amino acid
peptide of the tat protein of HIV; a 20 residue peptide sequence which corresponds to
amino acids 84-103 of the p16 protein (see Fahraeus et al. (1996) Curr. Biol. 6:84);
the third helix of the 60-amino acid long homeodomain of Antennapedia (Derossi et
al. (1994) J. Biol. Chem. 269:10444); the h region of a signal peptide, such as the
Kaposi fibroblast growth factor (K-FGF) h region (Lin et al., supra); and the VP22
translocation domain from HSV (Elliot et al. (1997) Cell 88:223-233). Other suitable
chemical moieties that provide enhanced cellular uptake can also be linked, either
covalently or non-covalently, to the fusion polypeptides disclosed herein.

Toxin molecules also have the ability to transport polypeptides across cell
membranes. Often, such molecules (called "binary toxins") are composed of at least
two parts: a translocation or binding domain and a separate toxin domain. Typically,
the translocation domain, which can optionally be a polypeptide, binds to a cellular
receptor, facilitating transport of the toxin into the cell. Several bacterial toxins,
including Clostridium perfringens iota toxin, diphtheria toxin (DT), Pseudomonas
exotoxin A (PE), pertussis toxin (PT), Bacillus anthracis toxin, and pertussis
adenylate cyclase (CYA), have been used to deliver peptides to the cell cytosol as
internal or amino-terminal fusions. Arora et al. (1993) J. Biol. Chem. 268:3334-3341;
Perelle et al. (1993) Infect. Immun. 61:5147-5156; Stenmark et al. (1991) J. Cell
Biol. 113:1025-1032; Donnelly et al. (1993) Proc. Natl. Acad. Sci. USA 90:3530-3534;
Carbonetti et al. (1995) Abstr. Annu. Meet. Am. Soc. Microbiol. 95:295; Sebo
et al. (1995) Infect. Immun. 63:3851-3857; Klimpel et al. (1992) Proc. Natl. Acad.
Sci. USA 89:10277-10281; and Novak et al. (1992) J. Biol. Chem. 267:17186-17193.

Such subsequences can be used to translocate polypeptides, including the
fusion polypeptides disclosed herein, across a cell membrane. This is accomplished,
for example, by derivatizing the fusion polypeptide with one of these translocation
sequences, or by forming an additional fusion of the translocation sequence with the
fusion polypeptide. Optionally, a linker can be used to link the fusion polypeptide and
the translocation sequence. Any suitable linker can be used, e.g., a peptide linker.

A suitable polypeptide can also be introduced into an animal cell, preferably a
mammalian cell, via liposomes and liposome derivatives such as immunoliposomes.
The term "liposome" refers to vesicles comprised of one or more concentrically
ordered lipid bilayers, which encapsulate an aqueous phase. The aqueous phase
typically contains the compound to be delivered to the cell.

The liposome fuses with the plasma membrane, thereby releasing the
compound into the cytosol. Alternatively, the liposome is phagocytosed or taken up
by the cell in a transport vesicle. Once in the endosome or phagosome, the liposome
is either degraded or it fuses with the membrane of the transport vesicle and releases
its contents.

In current methods of drug delivery via liposomes, the liposome ultimately
becomes permeable and releases the encapsulated compound at the target tissue or
cell. For systemic or tissue specific delivery, this can be accomplished, for example,
in a passive manner wherein the liposome bilayer is degraded over time through the
action of various agents in the body. Alternatively, active drug release involves using
an agent to induce a permeability change in the liposome vesicle. Liposome
membranes can be constructed so that they become destabilized when the
environment becomes acidic near the liposome membrane. See, e.g., Proc. Natl.
Acad. Sci. USA 84:7851 (1987); Biochemistry 28:908 (1989). When liposomes are
endocytosed by a target cell, for example, they become destabilized and release their
contents. This destabilization is termed fusogenesis.

Dioleoylphosphatidylethanolamine (DOPE) is the basis of many "fusogenic" systems. 

For use with the methods and compositions disclosed herein, liposomes
typically comprise a fusion polypeptide as disclosed herein, a lipid component, e.g., a
neutral and/or cationic lipid, and optionally include a receptor-recognition molecule
such as an antibody that binds to a predetermined cell surface receptor or ligand (e.g.,
an antigen). A variety of methods are available for preparing liposomes as described
in, e.g., U.S. Patent Nos. 4,186,183; 4,217,344; 4,235,871; 4,261,975; 4,485,054;
4,501,728; 4,774,085; 4,837,028; 4,946,787; PCT Publication No. WO 91/17424;
Szoka et al. (1980) Ann. Rev. Biophys. Bioeng. 9:467; Deamer et al. (1976) Biochim.
Biophys. Acta 443:629-634; Fraley et al. (1979) Proc. Natl. Acad. Sci. USA 76:3348-3352;
Hope et al. (1985) Biochim. Biophys. Acta 812:55-65; Mayer et al. (1986) Biochim.
Biophys. Acta 858:161-168; Williams et al. (1988) Proc. Natl. Acad. Sci. USA
85:242-246; Liposomes, Ostro (ed.), 1983, Chapter 1; Hope et al. (1986) Chem.
Phys. Lip. 40:89; Gregoriadis, Liposome Technology (1984) and Lasic, Liposomes:
from Physics to Applications (1993). Suitable methods include, for example,
sonication, extrusion, high pressure/homogenization, microfluidization, detergent
dialysis, calcium-induced fusion of small liposome vesicles and ether-fusion methods,
all of which are well known in the art.

In certain embodiments, it may be desirable to target a liposome using
targeting moieties that are specific to a particular cell type, tissue, and the like.
Targeting of liposomes using a variety of targeting moieties (e.g., ligands, receptors,
and monoclonal antibodies) has been previously described. See, e.g., U.S. Patent
Nos. 4,957,773 and 4,603,044.

Examples of targeting moieties include monoclonal antibodies specific to
antigens associated with neoplasms, such as prostate cancer specific antigen and
MAGE. Tumors can also be diagnosed by detecting gene products resulting from the
activation or over-expression of oncogenes, such as ras or c-erbB2. In addition, many
tumors express antigens normally expressed by fetal tissue, such as the
alphafetoprotein (AFP) and carcinoembryonic antigen (CEA). Sites of viral infection
can be diagnosed using various viral antigens such as hepatitis B core and surface
antigens (HBVc, HBVs), hepatitis C antigens, Epstein-Barr virus antigens, human
immunodeficiency type-1 virus (HIV-1) and papilloma virus antigens. Inflammation
can be detected using molecules specifically recognized by surface molecules which
are expressed at sites of inflammation such as integrins (e.g., VCAM-1), selectin
receptors (e.g., ELAM-1) and the like.

Standard methods for coupling targeting agents to liposomes are used. These
methods generally involve the incorporation into liposomes of lipid components, e.g.,
phosphatidylethanolamine, which can be activated for attachment of targeting agents,
or incorporation of derivatized lipophilic compounds, such as lipid derivatized
bleomycin. Antibody targeted liposomes can be constructed using, for instance,
liposomes which incorporate protein A. See Renneisen et al. (1990) J. Biol. Chem.
265:16337-16342 and Leonetti et al. (1990) Proc. Natl. Acad. Sci. USA 87:2448-2451.

Pharmaceutical compositions and administration 

Fusion molecules as disclosed herein, and expression vectors encoding these
polypeptides, can be used in conjunction with various methods of gene therapy to
facilitate the action of a therapeutic gene product. In such applications, the fusion
molecule can be administered directly to a patient, e.g., to facilitate the modulation of
gene expression and for therapeutic or prophylactic applications, for example, cancer,
ischemia, diabetic retinopathy, macular degeneration, rheumatoid arthritis, psoriasis,
HIV infection, sickle cell anemia, Alzheimer's disease, muscular dystrophy,
neurodegenerative diseases, vascular disease, cystic fibrosis, stroke, and the like.
Examples of microorganisms whose inhibition can be facilitated through use of the
methods and compositions disclosed herein include pathogenic bacteria, e.g.,
Chlamydia, Rickettsial bacteria, Mycobacteria, Staphylococci, Streptococci,
Pneumococci, Meningococci and Gonococci, Klebsiella, Proteus, Serratia,
Pseudomonas, Legionella, Diphtheria, Salmonella, Bacilli (e.g., anthrax), Vibrio (e.g.,
cholera), Clostridium (e.g., tetanus, botulism), Yersinia (e.g., plague), Leptospirosis,
and Borrelia (e.g., Lyme disease bacteria); infectious fungus, e.g., Aspergillus,
Candida species; protozoa such as sporozoa (e.g., Plasmodia), rhizopods (e.g.,
Entamoeba) and flagellates (Trypanosoma, Leishmania, Trichomonas, Giardia,
etc.); viruses, e.g., hepatitis (A, B, or C), herpes viruses (e.g., VZV, HSV-1, HHV-6,
HSV-II, CMV, and EBV), HIV, Ebola, Marburg and related hemorrhagic
fever-causing viruses, adenoviruses, influenza viruses, flaviviruses, echoviruses,
rhinoviruses, coxsackie viruses, coronaviruses, respiratory syncytial viruses, mumps
viruses, rotaviruses, measles viruses, rubella viruses, parvoviruses, vaccinia viruses,
HTLV viruses, retroviruses, lentiviruses, dengue viruses, papillomaviruses,
polioviruses, rabies viruses, and arboviral encephalitis viruses, etc.

Administration of therapeutically effective amounts of a localization
domain-DNA-binding domain fusion molecule, a localization domain-DNA-binding
domain-regulatory domain fusion or a nucleic acid encoding these fusion polypeptides is by
any of the routes normally used for introducing polypeptides or nucleic acids into
ultimate contact with the tissue to be treated. The polypeptides or nucleic acids are
administered in any suitable manner, preferably in a pharmaceutically acceptable
carrier. Suitable methods of administering such modulators are available and well
known to those of skill in the art, and, although more than one route can be used to
administer a particular composition, a particular route can often provide a more
immediate and more effective reaction than another route.

Pharmaceutically acceptable carriers are determined in part by the particular
composition being administered, as well as by the particular method used to
administer the composition. Accordingly, there is a wide variety of suitable
formulations of pharmaceutical compositions. See, e.g., Remington's Pharmaceutical
Sciences, 17th ed. 1985.

Fusion polypeptides or nucleic acids, alone or in combination with other
suitable components, can be made into aerosol formulations (i.e., they can be
"nebulized") to be administered via inhalation. Aerosol formulations can be placed
into pressurized acceptable propellants, such as dichlorodifluoromethane, propane,
nitrogen, and the like.

Formulations suitable for parenteral administration, such as, for example, by
intravenous, intramuscular, intradermal, intracardiac and subcutaneous routes, include
aqueous and non-aqueous, isotonic sterile injection solutions, which can contain
antioxidants, buffers, bacteriostats, and solutes that render the formulation isotonic
with the blood of the intended recipient, and aqueous and non-aqueous sterile
suspensions that can include suspending agents, solubilizers, thickening agents,
stabilizers, and preservatives. Compositions can be administered, for example, by
intravenous infusion, orally, topically, intraperitoneally, intrapleurally, intravesically
or intrathecally. The formulations of compounds can be presented in unit-dose or
multi-dose sealed containers, such as ampoules and vials. Injection solutions and
suspensions can be prepared from sterile powders, granules, and tablets of the kind
known to those of skill in the art.

Applications 

The compositions and methods disclosed herein can be used to modulate a
number of cellular processes. These include, but are not limited to, transcription,
replication, recombination, repair, integration, maintenance of telomeres, and
processes involved in chromosome stability and disjunction. Accordingly, the
methods and compositions disclosed herein can be used to affect any of these
processes, as well as any other processes which are influenced by localization domain
fusion molecules and their effects on gene expression, intranuclear localization and
chromatin structure.

In preferred embodiments, a localization domain/DNA-binding domain fusion
is used to achieve targeted repression of gene expression. Targeting is based upon the
specificity of the DNA-binding domain. In another embodiment, a localization
domain/DNA-binding domain/transcriptional activation domain fusion is used to
achieve reactivation of a developmentally-silenced gene. In additional embodiments,
a localization domain/DNA-binding domain/chromatin remodeling complex
component fusion is used to remodel the chromatin structure of a repressed gene
located in a heterochromatic nuclear compartment, to allow access of transcriptional
activators, either endogenous or exogenous. In these embodiments, additional
molecules, exogenous and/or endogenous, can be used to facilitate repression or
activation of gene expression. The additional molecules can also be fusion molecules,
for example, fusions between a DNA-binding domain and a functional domain such
as an activation or repression domain or a component of a chromatin remodeling
complex.

Accordingly, expression of any gene in any organism can be modulated using
the methods and compositions disclosed herein, including therapeutically relevant
genes, genes of infecting microorganisms, viral genes, and genes whose expression is
modulated in the process of target validation. Such genes include, but are not limited
to, vascular endothelial growth factor (VEGF), VEGF receptors flt and flk, CCR-5,
low density lipoprotein receptor (LDLR), estrogen receptor, HER-2/neu, BRCA-1,
BRCA-2, phosphoenolpyruvate carboxykinase (PEPCK), CYP7, fibrinogen,
apolipoprotein A (ApoA), apolipoprotein B (ApoB), renin, nuclear factor κB (NF-κB),
inhibitor of NF-κB (I-κB), tumor necrosis factors (e.g., TNF-α, TNF-β), interleukin-1 (IL-1),
FAS (CD95), FAS ligand (CD95L), atrial natriuretic factor, platelet-derived factor
(PDF), amyloid precursor protein (APP), tyrosinase, tyrosine hydroxylase, β-aspartyl
hydroxylase, alkaline phosphatase, calpains (e.g., CAPN10), neuronal pentraxin
receptor, adriamycin response protein, apolipoprotein E (apoE), leptin, leptin receptor,
UCP-1, IL-1, IL-1 receptor, IL-2, IL-3, IL-4, IL-5, IL-6, IL-12, IL-15, interleukin
receptors, G-CSF, GM-CSF, colony stimulating factor, erythropoietin (EPO),
platelet-derived growth factor (PDGF), PDGF receptor, fibroblast growth factor (FGF), FGF
receptor, PAF, p16, p19, p53, Rb, p21, myc, myb, globin, dystrophin, eutrophin, cystic
fibrosis transmembrane conductance regulator (CFTR), GDNF, nerve growth factor
(NGF), NGF receptor, epidermal growth factor (EGF), EGF receptor, transforming
growth factors (e.g., TGF-α, TGF-β), fibroblast growth factor (FGF), interferons
(e.g., IFN-α, IFN-β and IFN-γ), insulin-related growth factor-1 (IGF-1), angiostatin,
ICAM-1, signal transducer and activator of transcription (STAT), androgen receptors,
e-cadherin, cathepsins (e.g., cathepsin W), topoisomerase, telomerase, bcl, bcl-2, Bax,
T Cell-specific tyrosine kinase (Lck), p38 mitogen-activated protein kinase, protein
tyrosine phosphatase (hPTP), adenylate cyclase, guanylate cyclase, α7 neuronal
nicotinic acetylcholine receptor, 5-hydroxytryptamine (serotonin)-2A receptor,
transcription elongation factor-3 (TEF-3), phosphatidylcholine transferase, PTI-1,
polygalacturonase, EPSP synthase, FAD2-1, Δ-9 desaturase, Δ-12 desaturase, Δ-15
desaturase, acetyl-Coenzyme A carboxylase, acyl-ACP thioesterase, ADP-glucose
pyrophosphorylase, starch synthase, cellulose synthase, sucrose synthase, fatty acid
hydroperoxide lyase, and peroxisome proliferator-activated receptors, such as
PPAR-γ2. See also Science 291:1177-1351 (2001) and Nature 409:813-958 (2001).

Expression of human, mammalian, bacterial, fungal, protozoal, archaeal, plant
and viral genes can be modulated. Viral genes include, but are not limited to,
hepatitis virus genes such as, for example, HBV-C, HBV-S, HBV-X and HBV-P; and
HIV genes such as, for example, tat and rev. Modulation of expression of genes
encoding antigens of a pathogenic organism can be achieved using the disclosed
methods and compositions.

Additional genes include those encoding cytokines, lymphokines, interleukins,
growth factors, mitogenic factors, apoptotic factors, cytochromes, chemotactic
factors, chemokine receptors (e.g., CCR-2, CCR-3, CCR-5, CXCR-4), phospholipases
(e.g., phospholipase C), nuclear receptors, retinoid receptors, organellar receptors,
hormones, hormone receptors, oncogenes, tumor suppressors, cyclins, cell cycle
checkpoint proteins (e.g., Chk1, Chk2), senescence-associated genes,
immunoglobulins, genes encoding heavy metal chelators, protein tyrosine kinases,
protein tyrosine phosphatases, tumor necrosis factor receptor-associated factors (e.g.,
Traf-3, Traf-6), apolipoproteins, thrombic factors, vasoactive factors, neuroreceptors,
cell surface receptors, G-proteins, G-protein-coupled receptors (e.g., substance K
receptor, angiotensin receptor, α- and β-adrenergic receptors, serotonin receptors, and
PAF receptor), muscarinic receptors, acetylcholine receptors, GABA receptors,
glutamate receptors, dopamine receptors, adhesion proteins (e.g., CAMs, selectins,
integrins and immunoglobulin superfamily members), ion channels,
receptor-associated factors, hematopoietic factors, transcription factors, and molecules
involved in signal transduction. Expression of disease-related genes, and/or of one or
more genes specific to a particular tissue or cell type such as, for example, brain,
muscle, heart, nervous system, circulatory system, reproductive system, genitourinary
system, digestive system and respiratory system can also be modulated.

Thus, the methods and compositions disclosed herein can be used in processes
such as, for example, therapeutic regulation of disease-related genes, engineering of
cells for manufacture of protein pharmaceuticals, pharmaceutical discovery (including
target discovery, target validation and engineering of cells for high throughput
screening methods) and plant agriculture.

EXAMPLES

The following examples are presented as illustrative of, but not limiting, the 
claimed subject matter. 



Example 1: Materials and Methods

Cloning of dMBD-like and dMBD-likeA

dMBD-likeA was obtained as an EST clone (LD03777, 23) and the coding
sequence amplified and cloned using a T/A cloning kit (Invitrogen) according to the
manufacturer's directions. DNA sequencing confirmed the fidelity of amplification
and the coding sequence was then subcloned into pET-21a(+) (Novagen) using the
NheI-XhoI sites.

dMBD-like expression clones were prepared by RT-PCR from total
Drosophila RNA using the following primer pair:

MBDf: 5'-GGAATTGGGAATTGCGCTAGCATGAACCCGAGCGTCACAATC-3'
(SEQ ID NO: 2);

MBDr: 5'-GCGAATTCTGTCTTGAGTGCATCCTGCAGCTTTCGCGCAACTCCG-3'
(SEQ ID NO: 3).

PCR products were isolated on agarose gels, excised, reamplified and cloned into the
EcoRI-NheI sites of pTYB1 (NEB). Fidelity of reverse transcription and amplification
was verified by DNA sequencing.
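
As an illustrative check of the primer design, the following Python sketch locates the NheI (GCTAGC) and EcoRI (GAATTC) recognition sequences in the two primers given above and reports a simple GC fraction. The function names are hypothetical and the script is not part of the described method; it only shows that each primer carries the site used for cloning into pTYB1.

```python
# Standard recognition sequences for the two enzymes named in the text.
NHEI = "GCTAGC"
ECORI = "GAATTC"

# Primer sequences as given above (SEQ ID NO: 2 and SEQ ID NO: 3), 5' to 3'.
MBDF = "GGAATTGGGAATTGCGCTAGCATGAACCCGAGCGTCACAATC"
MBDR = "GCGAATTCTGTCTTGAGTGCATCCTGCAGCTTTCGCGCAACTCCG"


def site_positions(seq, site):
    """Return every 0-based start index of `site` in `seq` (overlaps allowed)."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in (("MBDf", MBDF), ("MBDr", MBDR)):
        print(name, "length:", len(seq), "GC fraction:", round(gc_fraction(seq), 2))
        print("  NheI sites at:", site_positions(seq, NHEI))
        print("  EcoRI sites at:", site_positions(seq, ECORI))
```

In MBDf the NheI site immediately precedes an ATG codon, consistent with in-frame cloning of the coding sequence; MBDr carries the EcoRI site near its 5' end.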



Purification of recombimnt protein 

Recombinant dMBD-likeA was expressed in E. coli BL21 (DE3). 500 ml of
LB were inoculated with 5 ml of an overnight culture and incubated at 37°C to A600 of
0.7. Induction was performed by addition of isopropyl β-D-thiogalactoside (IPTG) to 1 mM
and incubation at 37°C for 3 additional hours. Cells were harvested and resuspended
in 10 ml of sonication buffer (20 mM Tris-HCl pH 8.0, 500 mM NaCl, 5 mM
imidazole, 0.1% Nonidet-P-40 (NP-40), 1 mM 2-mercaptoethanol). Purification of
the soluble His-tagged protein was performed with TALON resin (Clontech)
according to the manufacturer's protocol. Protein was dialyzed versus 20 mM Tris-HCl,
pH 8.0, 500 mM NaCl, 1 mM 2-mercaptoethanol, 2 mM MgCl2. Quantitation
was performed using the BioRad protein assay. Recombinant dMBD-like was
prepared using the Impact CN System (New England Biolabs) according to the
manufacturer's protocol.
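
The induction step is a simple C1·V1 = C2·V2 dilution. The sketch below computes the volume of inducer stock needed to reach 1 mM in the 500 ml culture. The 1 M stock concentration is an assumption made for illustration only; the text does not specify it, and the function name is hypothetical.

```python
def stock_volume_ml(final_mM: float, culture_volume_ml: float, stock_mM: float) -> float:
    """Volume of stock (ml) giving `final_mM` in the culture.

    Uses C1*V1 = C2*V2 and ignores the small volume added by the stock itself.
    """
    if stock_mM <= 0:
        raise ValueError("stock concentration must be positive")
    return final_mM * culture_volume_ml / stock_mM


# Assumed 1 M (1000 mM) inducer stock; 500 ml culture and 1 mM final are from the text.
inducer_ml = stock_volume_ml(final_mM=1.0, culture_volume_ml=500.0, stock_mM=1000.0)
print(f"Add {inducer_ml:.2f} ml of stock")
```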

Gel mobility-shift and southwestern assays

Gel mobility shifts were performed in 10% polyacrylamide gels run in 0.5X
TBE buffer (45 mM Tris, pH 8.0, 45 mM boric acid, 1 mM EDTA) using GAC12 or
GAM12 double-stranded oligonucleotide probes, essentially as described in Wade et
al. (1999) Nature Genet 23:62-66. One picomole of radiolabelled probe was mixed
with purified recombinant protein as indicated in the figure legends in 10 mM Tris-HCl
pH 8.0, 3 mM MgCl2, 50 mM NaCl, 0.1 mM EDTA, 0.1% NP-40, 2 mM DTT,
5% glycerol and 0.4 mg/ml BSA. The samples were incubated for 30 min at 37°C.
30 pmoles of competitor DNA (GAC12 or GAM12) were used per binding reaction.
Gels were scanned on a PhosphorImager (Molecular Dynamics). The procedure used
for southwestern assays is as described in Wade et al. (1999), supra.

Antibodies, immunoblots and immunoprecipitations

Protein samples were resolved by SDS-PAGE and transferred to Immobilon-P
membrane (Millipore) following the manufacturer's recommendations. The
Drosophila MBD-like antibodies were elicited in rabbits by subcutaneous injection of
recombinant dMBD-likeA by Covance Laboratories, Inc. For immunoprecipitations,
antibodies were immobilized on Protein A beads (Pierce) and subsequently incubated
with 100 µg nuclear extract, or 100 µg of the BioRex 0.5 M fraction for 2 h at 4°C
with rotation. The beads were washed three times with Buffer C (0.1 M NaCl, see
below) and analyzed by histone deacetylase or ATPase assays. NP-40 (0.01%) was
included in all buffers.

Histone deacetylase and ATPase assays 

Chicken histone octamers (20.25 nmoles) were acetylated using recombinant
yeast HAT1p and 3H-acetyl CoA (5.3 nm) followed by a cold Acetyl-CoA (100 nm)
chase for complete acetylation. Deacetylation of the samples was carried out in a
reaction (200 µl) containing 25 mM Tris pH 8.0, 50 mM NaCl, 1 mM EDTA, 10%
glycerol, and 1 µg 3H-histone octamer substrate. Reactions were incubated for 30 min
at 30°C, were terminated by adding 50 µl stop solution (0.1 M HCl and 0.16 M
HAc), and extracted with 600 µl ethyl acetate. 450 µl of the organic layer was
counted in a liquid scintillation counter. Released acetate is indicated in the figures as
cpm.
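
Because only 450 µl of the 600 µl ethyl acetate phase is counted, the measured cpm represents three quarters of the extracted label. The following sketch scales a counted value to the whole organic phase. It assumes the released acetate is evenly distributed in the organic layer, and the example cpm value is hypothetical; the figures themselves are described as reporting cpm as counted.

```python
COUNTED_UL = 450.0   # volume of organic layer counted (from the text)
ORGANIC_UL = 600.0   # volume of ethyl acetate used for extraction (from the text)


def total_organic_cpm(counted_cpm: float,
                      counted_ul: float = COUNTED_UL,
                      organic_ul: float = ORGANIC_UL) -> float:
    """Scale counted cpm to the full organic phase (assumes uniform distribution)."""
    return counted_cpm * organic_ul / counted_ul


fraction_counted = COUNTED_UL / ORGANIC_UL
example_total = total_organic_cpm(1500.0)  # 1500 cpm is a made-up reading
print(f"Fraction counted: {fraction_counted:.2f}; scaled total: {example_total:.0f} cpm")
```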

In an ATPase assay, samples were incubated with γ-32P-ATP in the absence or
presence of chicken erythrocyte mononucleosomes for 30 min at room temperature.
Reactions were spotted on PEI-cellulose thin-layer chromatography plates and
developed in 1 M formic acid and 0.5 M LiCl. ATP hydrolysis was quantitated using
a PhosphorImager (Molecular Dynamics) with ImageQuant software.

Fractionation of dMBD-like-containing complexes 

S2 cells were grown in suspension culture in Grace's Insect media
supplemented with 10% heat-treated fetal bovine serum (FBS). Cells were
centrifuged at 5,000 rpm for 10 min. The pellet was resuspended in 40 ml of Buffer
A (10 mM Hepes pH 7.5, 15 mM KCl, 2 mM MgCl2, 0.1 mM EDTA, 1 mM DTT, 0.5
mM PMSF) and 2.7 ml of Buffer B (50 mM Hepes pH 7.5, 1 M KCl, 30 mM MgCl2,
0.1 mM EDTA, 1 mM DTT, 0.5 mM PMSF) and centrifuged at 5,000 rpm for 10 min.
The pellet was resuspended in 10 ml of Buffer A and re-centrifuged as above. Cells
were resuspended in 20 ml of Buffer A and homogenized with a Dounce
homogenizer. 1.5 ml of Buffer B were added and the mixture was again homogenized
for a few additional strokes. The homogenate was centrifuged at 8,000 rpm for 8 min.
Nuclei were resuspended in 20 ml of Buffer A and homogenized with 6-8 strokes. 2
ml of 4 M (NH4)2SO4 were added. The suspension was rotated for 30 min and
centrifuged at 10,000 rpm for 30 min. The supernatant was dialyzed versus Buffer C
(100 mM KCl, 20 mM Hepes pH 7.5, 1 mM EGTA, 1.5 mM MgCl2, 1 mM PMSF,
0.5 mM DTT, 10% glycerol) and cleared in a SW50.1 rotor at 40,000 rpm for 60
minutes.
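
The salt conditions produced by the buffer additions above can be estimated with a volume-weighted average. The sketch below does this for the two Buffer A/Buffer B mixtures and for the ammonium sulfate addition. It assumes volumes are additive and ignores the volume and salt contributed by the cell or nuclear pellet, so the results are approximate; the function name is hypothetical.

```python
def mixed_concentration_mM(components):
    """Final concentration (mM) of one solute after mixing.

    `components` is a list of (volume_ml, concentration_mM) pairs.
    Assumes additive volumes.
    """
    total_volume = sum(volume for volume, _ in components)
    total_amount = sum(volume * conc for volume, conc in components)
    return total_amount / total_volume


# KCl: Buffer A is 15 mM, Buffer B is 1 M (1000 mM).
kcl_first_mix = mixed_concentration_mM([(40.0, 15.0), (2.7, 1000.0)])
kcl_second_mix = mixed_concentration_mM([(20.0, 15.0), (1.5, 1000.0)])

# Ammonium sulfate: 2 ml of 4 M stock added to 20 ml of nuclei in Buffer A.
ammonium_sulfate = mixed_concentration_mM([(20.0, 0.0), (2.0, 4000.0)])

print(f"KCl after 40 ml A + 2.7 ml B: {kcl_first_mix:.1f} mM")
print(f"KCl after 20 ml A + 1.5 ml B: {kcl_second_mix:.1f} mM")
print(f"(NH4)2SO4 after addition: {ammonium_sulfate:.1f} mM")
```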

The dialyzed extract was loaded onto a BioRex70 (Na+) (BioRad) column
equilibrated with Buffer C at 10 mg protein per ml packed bed volume (cv), washed
with 3 cv Buffer C (0.1 M), and step eluted with Buffer C (0.5 M). The 0.5 M
fraction containing all the detectable dMBD-likeA was fractionated over MonoQ
HR5/5 (Pharmacia Biotech) in a 20 cv linear gradient from Buffer C (0.1 M) to Buffer
C (1 M) and collected in 0.5 ml fractions. All fractions were analyzed by immunoblot
and histone-deacetylase assay. The fraction containing the peak of dMBD-likeA was
dialyzed, centrifuged, and fractionated on a Superose6 HR10/30 gel filtration column
(Pharmacia Biotech). All fractions were analyzed by immunoblotting and HDAC
assay. Antisera to Mi-2 and Mta1 were described elsewhere (Wade et al., 1999,
supra) as were SIN3 and RPD3 antisera.
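
To relate a MonoQ fraction number to an approximate salt concentration, the linear gradient can be interpolated as sketched below. The 1 ml column volume is an assumption (a nominal value for an HR5/5 column, not stated in the text), and the calculation ignores system dead volume and gradient delay, so it gives only a rough estimate. Function names are hypothetical.

```python
def gradient_salt_M(volume_ml: float,
                    column_volume_ml: float = 1.0,  # assumed nominal column volume
                    gradient_cv: float = 20.0,      # gradient length from the text
                    start_M: float = 0.1,
                    end_M: float = 1.0) -> float:
    """Salt concentration (M) after `volume_ml` of a linear gradient."""
    gradient_ml = column_volume_ml * gradient_cv
    progress = min(max(volume_ml / gradient_ml, 0.0), 1.0)  # clamp to [0, 1]
    return start_M + progress * (end_M - start_M)


def fraction_midpoint_ml(fraction_number: int, fraction_ml: float = 0.5) -> float:
    """Elution volume at the midpoint of a 1-based fraction number."""
    return (fraction_number - 0.5) * fraction_ml


n_fractions = int(1.0 * 20.0 / 0.5)  # fractions spanning the gradient
salt_at_midgradient = gradient_salt_M(10.0)
print(f"{n_fractions} fractions; salt at 10 ml is about {salt_at_midgradient:.2f} M")
```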



Northern blot analysis

RNA was prepared from staged embryos, larvae, or adult flies using the Trizol
reagent (Life Technologies) according to the manufacturer's directions. 10 µg total
RNA was loaded per lane, resolved on 1.2% agarose gels containing formaldehyde,
transferred to nylon, and hybridized with a random-primed probe corresponding to the
coding sequence of dMBD-likeA.



Cotransfection assays 

Drosophila S2 cells were grown in Schneider's medium (Sigma) at 27°C
containing 10% FBS and penicillin/streptomycin. Cells were transfected using the
Superfect reagent (Qiagen) following the manufacturer's directions. Transfection
assays included 1.5 µg of either the pG5DE5tkLuc reporter or the p-37tkRLuc internal
control, essentially as described in Chen et al. (1998) Mol Cell Biol 18:7259-7268.
The Gal4 DNA binding domain constructs, pACTIN-SV-Gal4-Gro, pACTIN-SV-Gal4-MBD
and pACTIN-SV-Gal4-MBDA were constructed by insertion of the Gal4
DBD into the HindIII/PstI sites of pACTIN-SV followed by insertion of Gro, dMBD-like
and dMBD-likeA into the EcoRI site of the resulting plasmid. Quantities of
individual Gal4 plasmids were varied as described in the figure legends. The total
amount of plasmid was normalized to 4 µg by addition of pACTIN-SV vector (Huynh
et al. (1999) J. Mol Biol 288:13-20) as carrier. Cells were harvested 12 h after
transfection and luciferase assays were performed using an Enhanced Luciferase
Assay Kit (Pharmingen) according to the manufacturer's instructions. Where
indicated, transfected cells were treated with Trichostatin A (TSA, Wako) for 24 h
before harvest.
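
The bookkeeping behind such a cotransfection can be sketched as follows: the carrier amount tops each transfection up to 4 µg total plasmid, and reporter activity is divided by the internal control before comparing conditions. This is a minimal illustration under assumed inputs; the plasmid amounts and luciferase readings in the example are hypothetical, and the function names are not from the source.

```python
TOTAL_PLASMID_UG = 4.0  # total plasmid per transfection, from the text


def carrier_ug(plasmid_amounts_ug, total_ug: float = TOTAL_PLASMID_UG) -> float:
    """Amount of empty carrier vector needed to reach the fixed total."""
    used = sum(plasmid_amounts_ug)
    if used > total_ug:
        raise ValueError("plasmid amounts exceed the total allowed")
    return total_ug - used


def normalized_activity(reporter_signal: float, control_signal: float) -> float:
    """Reporter luciferase signal divided by the internal-control signal."""
    return reporter_signal / control_signal


def fold_repression(norm_without_effector: float, norm_with_effector: float) -> float:
    """Ratio of normalized activity without versus with the Gal4 fusion."""
    return norm_without_effector / norm_with_effector


# Hypothetical example: 1.5 ug reporter plus 0.25 ug Gal4 fusion plasmid.
carrier = carrier_ug([1.5, 0.25])
fold = fold_repression(normalized_activity(1000.0, 100.0),
                       normalized_activity(200.0, 100.0))
print(f"Carrier: {carrier:.2f} ug; fold repression: {fold:.1f}")
```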



Polytene chromosome staining

Polytene chromosome squashes and staining were performed on Canton-S
flies as described in Zink and Paro (1989) Nature 337:468-471 and Westwood (1991)
Nature 353:822-827. Briefly, salivary glands were dissected in PBS and were placed
directly in fixative containing 3.7% formaldehyde, 45% acetic acid for 1 min prior to
squashing. The spreads were stained with α-dMBD-like (1:200) followed by Alexa
594-conjugated donkey α-goat IgG (1:400) (Molecular Probes). DNA was visualized
with 4',6-diamidino-2-phenylindole (DAPI; 1:1000). Control spreads stained with
pre-immune serum, at an equivalent concentration to that indicated above, showed no
staining. Each staining experiment was performed multiple times.

Example 2: Identification of a Drosophila MBD family 

A search for Drosophila sequences similar to vertebrate methyl CpG binding
proteins (MBDs) yielded multiple candidates (Figure 1A). The Drosophila proteins
are similar to vertebrate MBD proteins only in the putative methyl CpG binding
domain, with the exception of dMBD-like. The solution structure of this motif has
been solved for MeCP2 (Wakefield et al. (1999) J Mol Biol 291:1055-1065) and for
MBD1 (Ohki et al. (1999) EMBO J. 18:6653-6661). It consists of a wedge-shaped
structure composed of four antiparallel β-strands on one face and an α-helix and
hairpin loop on the other (Rubin et al. (2000) Science 287:2204-2215).

The sequences of the putative Drosophila MBD proteins were compared with
those of their vertebrate counterparts, focusing on residues critical to the structure of
the methyl CpG binding domain. Two uncharacterized products of the Drosophila
genome project, CG10042 and CG12196 (Adams et al. (2000) Science 287:2185-2195),
and the product of the six-banded gene (sba, Zeidler et al. (1997) Biol Chem.
378:1119-1124) contain most of these sequence features. Specifically, the regions
corresponding to the four beta strands are well conserved, including hydrophobic
residues (Fig 1A) proposed to be crucial for integrity of the fold (Ohki et al., supra).
There is some variation in the number of amino acids between loop L2 and the hairpin
loop, although the vertebrate MBD family members also differ in this aspect. Basic
residues that constitute a charged surface on one side of the vertebrate MBD
structures are also well conserved (Fig 1A). Finally, two hydrophobic residues
critical to the structure of the hairpin loop are also present. An important difference
between the Drosophila and vertebrate proteins occurs in the loop L1, located
between β-strands two and three (Fig 1A). In vertebrate methyl CpG binding
proteins, the spacing between strands β2 and β3 is invariant and the amino acid side
chains in this loop are very similar. In contrast, the Drosophila proteins have
variations both in length and in side chain chemistry in this loop (Fig 1A). This
region of MBD1 and MeCP2 undergoes a conformational change upon binding to
methylated DNA and is implicated as crucial for protein-DNA interaction. It seems
likely that alterations in this region of the protein will abolish methyl CpG binding
activity.

Drosophila MBD-like (dMBD-like) was previously identified as a sequence
relative of vertebrate MBD2 and MBD3 (Zhang et al. (1999) Genes Dev 13:1924-1935;
Tweedie et al. (1999) Nature Genet 23:329-390). It is similar to MBD2 and
MBD3 throughout its length (Fig 1B) and is encoded by a single gene (Flybase ID
FBgn0027950, 1). Two mRNAs are generated from this locus, one of 1115 bases, a
second of 842 bases (Rubin et al. (2000) Science 287:2222-2224). The protein
products of these alternatively spliced mRNAs differ in the amino acids encoded by
exon 2 (Fig 1B).
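
A short calculation shows that the size difference between the two mRNAs is a whole number of codons. The sketch below assumes, for illustration only, that the entire length difference lies within the coding region; the text states only that the protein products differ in the amino acids encoded by exon 2.

```python
LONG_MRNA_BASES = 1115   # longer transcript, from the text
SHORT_MRNA_BASES = 842   # shorter transcript, from the text

difference_bases = LONG_MRNA_BASES - SHORT_MRNA_BASES

# If the whole difference were coding sequence (an assumption), this is the
# number of codons removed and the leftover bases that would shift the frame.
codons_removed, leftover_bases = divmod(difference_bases, 3)

print(f"Difference: {difference_bases} bases")
print(f"Codons: {codons_removed}, leftover bases: {leftover_bases}")
```

A leftover of zero is consistent with the shorter isoform remaining in frame downstream of the spliced-out region.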

The two methyl CpG binding domain protein homologs were named dMBD-like
(product of the 1115 base mRNA) and dMBD-likeA (product of the 842 base
mRNA). Both dMBD-like isoforms share extensive sequence similarity with the
recently described forms of MBD3 from Xenopus (Wade et al. (1999) Nature Genet
23:62-66), particularly the third exon of the Drosophila gene (Fig 1B). However,
there are several gaps and non-conserved amino acids in the region corresponding to
the MBD (Fig 1B). The two dMBD-like proteins have an opa-like repeat (Wharton et
al. (1985) Cell 40:55-62) inserted in the loop between strands β2 and β3 (Fig 1B);
this region is predicted to interact with DNA (Wakefield et al. (1999) J Mol Biol
291:1055-1065). In addition, dMBD-like lacks the distal portion of the α-helix
making up one face of the wedge (Fig 1B). The shorter isoform, dMBD-likeA,
completely lacks the fourth β-strand, the α-helix, and the hairpin loop. Finally, there
are numerous amino acid changes at positions predicted to be crucial for DNA
interaction and structural integrity of the domain. Thus, it seemed unlikely that either
protein would bind methylated DNA.



Example 3: dMBD-like fails to bind methylated DNA

The two dMBD-like proteins were expressed in bacteria, purified, and their
binding properties were compared to Xenopus MBD3, a protein previously
demonstrated to bind selectively to methylated DNA (Wade et al. (1999), supra). In
Southwestern assays using immobilized protein, neither isoform interacted with the
probes, regardless of methylation status (Fig 1C). Thus, either the proteins fail to
bind or they are unable to refold on the membrane surface.

Solution interactions with DNA were also examined using an electrophoretic
mobility shift assay (Fig 1D). Neither of the Drosophila proteins bound, under
conditions where Xenopus MBD3 bound methylated DNA selectively. In fact, no
binding was observed for the dMBD-like isoforms even after reducing the
concentrations of cold competitor DNA up to 50 fold, resulting in a mass excess of
radiolabelled probe over cold competitor (see Example 1). No interaction was
detected under these conditions, even non-specific aggregation. Therefore, the results
indicate that neither dMBD-like isoform binds DNA, in keeping with the lack of
genome-wide methylation in Drosophila.
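
The statement that a 50-fold reduction in competitor leaves the probe in excess follows from the amounts given in Example 1 (1 pmol probe, 30 pmol competitor per reaction). The sketch below works through the ratio. It assumes the probe and competitor oligonucleotides are of comparable length, so that a molar ratio approximates a mass ratio; that assumption is not stated in the text.

```python
PROBE_PMOL = 1.0        # radiolabelled probe per reaction (Example 1)
COMPETITOR_PMOL = 30.0  # cold competitor per reaction (Example 1)
REDUCTION_FOLD = 50.0   # maximum reduction of competitor described above

standard_ratio = COMPETITOR_PMOL / PROBE_PMOL
reduced_competitor_pmol = COMPETITOR_PMOL / REDUCTION_FOLD
reduced_ratio = reduced_competitor_pmol / PROBE_PMOL

print(f"Standard competitor:probe ratio: {standard_ratio:.0f}:1")
print(f"Reduced competitor: {reduced_competitor_pmol:.1f} pmol "
      f"(competitor:probe ratio {reduced_ratio:.1f}:1)")
```

With the competitor reduced to 0.6 pmol, the probe outnumbers the competitor, so any residual DNA-binding activity would be expected to shift the labelled probe.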

Example 4: dMBD-likeA associates with HDAC and nucleosome-stimulated
ATPase activities

The sequence similarity between exon three of dMBD-like and vertebrate
MBD2 and MBD3 (Yao et al. (1993) Nature 366:476-479) implies conservation of
function. One potential role for this region is interaction with other proteins. As
Drosophila contains homologs of many proteins known to be components of HDAC-containing
corepressor complexes in other systems (Rubin et al. (2000) Science
287:2204-2215 and Adams et al. (2000) Science 287:2185-2195), the presence of
dMBD-like as a component of such a complex (or complexes) in Drosophila was
examined.

α-dMBD-like polyclonal antisera were used to investigate potential
interactions between dMBD-like and known components of corepressor complexes.
Immunoblot analysis confirmed that the antisera recognized both isoforms of dMBD-like
(Fig 2A). Interestingly, only the shorter isoform, dMBD-likeA, was detected in
nuclear extracts from S2 cells (Fig 2A). Immunoprecipitations were then performed
from S2 nuclear extracts and the precipitated proteins were assayed for enzymatic
activities associated with corepressor complexes. Immune serum, but not preimmune
serum, efficiently precipitated histone deacetylase activity (Fig 2B) and also
precipitated ATPase activity (Fig 2C). Like vertebrate and Drosophila Mi-2 (Chen et
al. (1999) Genes Devel 13:2218-2230; Guschin et al. (2000) Biochemistry 39:5238-5245),
the precipitated ATPase activity was stimulated by nucleosomes (Fig 2C).
These results indicate that dMBD-likeA is associated with an undefined histone
deacetylase and a nucleosome-stimulated ATPase in S2 nuclei, suggesting inclusion
of dMBD-likeA in a Drosophila Mi-2-like corepressor complex.

The relationship of dMBD-likeA with other proteins in S2 cells was also examined
using classical biochemical techniques. Nuclear extracts were fractionated using
ion exchange and gel filtration chromatography; dMBD-likeA was assayed in the
fractions by immunoblot. All the detectable dMBD-likeA bound the BioRex 70
column and was eluted at 0.5 M NaCl. Gradient elution of the BioRex 70 pool on
MonoQ yielded a major peak of deacetylase activity, precisely coeluting with SIN3
and RPD3 (Fig 3A). The peak of dMBD-likeA by immunoblot was resolved from the
peaks of HDAC activity, RPD3, and SIN3 by a single fraction (Fig 3A). The peak
fraction of dMBD-likeA from the MonoQ column was further purified using a
Superose 6 gel filtration column. The peaks of dMBD-likeA, Mi-2, and MTA1-like
(Wade et al. (1999) Nature Genet 23:62-66) coeluted at a position consistent with a
molecular mass of approximately 1 MDa (Fig 3B). SIN3, RPD3, and p55 (Zeidler et
al., supra; Martinez-Balbas et al. (1998) Proc. Natl Acad. Sci. USA 95:132-137), the
Drosophila RbAp48/p46 homolog, also closely, but not precisely, coeluted. Thus,
dMBD-likeA copurifies with Mi-2 and MTA1-like, consistent with its inclusion in a
Drosophila Mi-2 complex similar to that observed in vertebrates.

Example 5: dMBD-like represses transcription when tethered near a promoter 

A transcription assay, essentially as described in Chen et al. (1999) Genes
Devel 13:2218-2230 and Chen et al. (1998) Mol Cell Biol 18:7259-7268, was used to
assess the consequences of tethering the dMBD-like isoforms at a promoter. Both
dMBD-like and dMBD-likeA were fused to the Gal4 DNA binding domain (DBD). A
Gal4-Groucho fusion (see Chen et al. (1998) and Chen et al. (1999), both supra) and
the Gal4 DBD were used as positive and negative controls for transcriptional
repression, respectively (Fig 4A). Gal4 fusions to dMBD-like, dMBD-likeA and
Groucho mediated dose-dependent transcriptional repression following transfection
into S2 cells (Fig 4B). Control experiments showed that repression required a Gal4
site in the reporter plasmid (Fig 4B, lower) and that transfection of dMBD-like or
dMBD-likeA lacking the Gal4 DBD failed to repress transcription. The expression
levels of the transfected Gal4 fusion proteins were equivalent by immunoblot (Fig
4C). Further, the repression seen with the two tethered dMBD-like isoforms was
similar to that observed with the well-characterized repressor Groucho (Fig 4B).

If dMBD-like represses transcription through recruitment of a Drosophila Mi-2
complex, transcriptional repression should be sensitive to inhibitors of histone
deacetylases. Accordingly, the effect of Trichostatin-A, a histone deacetylase
inhibitor, on dMBD-like-mediated repression was tested. Repression mediated by
tethering of dMBD-like or dMBD-likeA was largely relieved by Trichostatin-A
(TSA); this effect was qualitatively very similar to that of TSA on Groucho-mediated
repression (Fig 4D).

In sum, the results indicate that both isoforms of dMBD-like function as 
transcriptional corepressors through recruitment of histone deacetylase activity, 
consistent with the proposed function of the Drosophila Mi-2 complex. 

Example 6: Developmental expression profile of dMBD-like

Northern analysis was used to ascertain the expression patterns of the two
dMBD-like isoforms during Drosophila development (Fig 5). Two transcripts,
corresponding to the splice variants of dMBD-like, were present in early embryos
(Fig 5A). Interestingly, the 0-2 hour embryos had only the longer mRNA. Levels of
this mRNA decline precipitously after 12 hours of embryonic development and
remain undetectable in larval stages and adult males; however, this mRNA is present
in adult females, possibly due to maternal mRNA in the ovary. Protein expression
patterns partially mimic the mRNA expression data (Fig 5B). Again, only dMBD-like
was observed in 0-2 hour embryos while both dMBD-like and dMBD-likeA were
present in 12-24 hour embryos. In the final embryonic stage and in the first larval
stage, only the dMBD-likeA isoform was present. Neither protein isoform was
detectable in the remaining larval stages or in adults, despite the presence of their
mRNAs (Fig 5A).

Example 7: dMBD-like associates with heterochromatin and a small number of
euchromatic sites in polytene chromosomes

To investigate target gene specificity of dMBD-like, its distribution on
Drosophila salivary gland polytene chromosomes from third instar larvae was
examined. Polytene chromosomes are ideal for this analysis since they are thought to
reflect the biochemical and structural properties of chromatin of diploid interphase
cells and their large size allows for the identification of individual chromosomal sites
(Hill et al. (1987) Int Rev Cytol 108:61-118). At this developmental stage, only the
shorter mRNA was detected, corresponding to dMBD-likeA (see Example 6).

To visualize the banding pattern of the polytene chromosomes, the
chromosomes were counterstained with DAPI, which stains brightest in the
condensed, banded regions of euchromatin and the constitutively condensed
heterochromatin at the chromocenter. Immunofluorescence staining with the antibody
to dMBD-like revealed preferential association with 29 euchromatic sites as well as
weaker association with ~100 euchromatic sites and centric heterochromatin. In
addition to localization to a set of discrete sites within the euchromatic chromosome
arms, dMBD-likeA is present at the chromocenter, a region of constitutive
heterochromatin. No staining was observed with preimmune sera or at the
hunchback-regulated Ubx locus. Interestingly, 69% of the predominant sites
correspond to developmental puffs that are transcriptionally induced by pulses of the
steroid hormone 20-hydroxyecdysone (ecdysone) during the late larval and prepupal
periods (Ashburner et al. (1972) Results Probl Cell Differ 4:101-151).

These binding studies indicate that dMBD-likeA is a component of an Mi-2
complex in flies and not a component of a SIN3-containing complex. Further, the Mi-2
complex in flies appears to have a role in either establishment or maintenance of
heterochromatin.

Although disclosure has been provided in some detail by way of illustration
and example for the purposes of clarity of understanding, it will be apparent to those
skilled in the art that various changes and modifications can be practiced without
departing from the spirit or scope of the disclosure. Accordingly, the foregoing
descriptions and examples should not be construed as limiting.

CLAIMS

What is claimed is: 

1. A method of compartmentalizing a region of interest in cellular 
chromatin, the method comprising contacting the region of interest 
with a composition that binds to a binding site in cellular chromatin, 
wherein the binding site is in a gene of interest and wherein the 
composition comprises a localization domain or functional fragment 
thereof and a DNA binding domain or functional fragment thereof. 

2. The method of claim 1, wherein the composition is a fusion molecule. 

3. The method of claim 1 or claim 2, wherein the DNA binding domain 
comprises a zinc finger DNA-binding domain. 

4. The method of any of claims 1 to 3, wherein the region of interest is 
compartmentalized into a nuclear compartment for packaging as 
heterochromatin. 

5. The method of any of claims 1 to 4, wherein the cellular chromatin is 
present in a plant cell. 

6. The method of any of claims 1 to 4, wherein the cellular chromatin is 
present in an animal cell. 

7. The method of claim 6, wherein the cell is a human cell. 

8. The method of any of claims 1 to 7, wherein the localization domain is
a methyl CpG binding domain or a functional fragment thereof.

9. The method of claim 8, wherein the methyl CpG binding domain is
selected from the group consisting of MECP1, MECP2, MBD1,
MBD2, MBD3, MBD4, dMBD-like and dMBD-likeA, and functional
fragments thereof.

10. The method of claim 1 or claim 2, wherein the DNA-binding domain
comprises a triplex-forming nucleic acid or a minor groove binder.

11. The method of any of claims 1 to 10, wherein compartmentalization
facilitates modulation of expression of a gene associated with the
region of interest.

12. The method of claim 11, wherein compartmentalization facilitates
repression of expression of a gene associated with the region of
interest.

13. The method of claim 2, wherein the fusion molecule is a polypeptide.

14. The method of claim 13, wherein the method further comprises the
step of contacting a cell with a polynucleotide encoding the
polypeptide, wherein the polypeptide is expressed in the cell.

15. The method of any of claims 1 to 14, wherein the gene encodes a
product selected from the group consisting of vascular endothelial
growth factor, erythropoietin, androgen receptor, PPAR-γ2, p16, p53,
pRb, dystrophin and e-cadherin.

16. The method of claim 8, wherein the methyl CpG binding domain or
functional fragment is related to a gene involved in a disease state
selected from the group consisting of ICF syndrome, Rett syndrome
and Fragile X syndrome.

17. A method of modulating expression of a gene, the method comprising
the step of contacting a region of DNA in cellular chromatin with a
fusion molecule that binds to a binding site in cellular chromatin,
wherein the binding site is in the gene and wherein the fusion molecule
comprises a DNA binding domain or functional fragment thereof and a
localization domain or functional fragment thereof.

18. The method of claim 17, wherein modulation comprises repression of
expression of the gene.

19. The method of claim 17 or claim 18, wherein the DNA-binding
domain of the fusion molecule comprises a zinc finger DNA-binding
domain.

20. The method of any of claims 17 to 19, wherein the DNA binding
domain binds to a target site in a gene encoding a product selected
from the group consisting of vascular endothelial growth factor,
erythropoietin, androgen receptor, PPAR-γ2, p16, p53, pRb, dystrophin
and e-cadherin.

21. The method of any of claims 17 to 20, wherein the localization domain
is a methyl CpG binding domain or a functional fragment thereof.

22. The method of claim 21, wherein the methyl CpG binding domain is a
polypeptide selected from the group consisting of MECP1, MECP2,
MBD1, MBD2, MBD3, MBD4, dMBD-like and dMBD-likeA and
functional fragments thereof.

23. The method of any of claims 17 to 22, wherein the gene is in a plant
cell.

24. The method of any of claims 17 to 22, wherein the gene is in an animal
cell.

25. The method of claim 24, wherein the cell is a human cell. 

26. The method of any of claims 17 to 25, wherein the fusion molecule is a
polypeptide.

27. The method of claim 26, wherein the method further comprises the
step of contacting the cell with a polynucleotide encoding the
polypeptide, wherein the polypeptide is expressed in the cell.

28. A method of modulating expression of a gene, the method comprising
the step of contacting a region of DNA in cellular chromatin with a
fusion molecule that binds to a binding site in cellular chromatin,
wherein the binding site is in the gene and wherein the fusion molecule
comprises a DNA binding domain, a localization domain and a
regulatory domain, or functional fragments thereof.

29. The method of claim 28, wherein modulation comprises repression of
expression of the gene.

30. The method of claim 28, wherein modulation comprises activation of
expression of the gene.

31. The method of any of claims 28 to 30, wherein the DNA-binding
domain of the fusion molecule comprises a zinc finger DNA-binding
domain.

32. The method of claim 30, wherein the regulatory domain comprises an
activation domain or a functional fragment thereof.

33. The method of claim 32, wherein the activation domain comprises VP-16,
p65, or functional fragments thereof.

34. The method of claim 29, wherein the regulatory domain comprises a
repression domain or a functional fragment thereof.

35. The method of any of claims 28 to 34, wherein the regulatory domain 
comprises a component of a chromatin remodeling complex. 

36. The method of any of claims 28 to 35, wherein the DNA binding
domain binds to a target site in a gene encoding a product selected
from the group consisting of vascular endothelial growth factor,
erythropoietin, androgen receptor, PPAR-γ2, p16, p53, pRb, dystrophin
and e-cadherin.

37. The method of any of claims 28 to 36, wherein the localization domain
is a methyl CpG binding domain or a functional fragment thereof.

38. The method of claim 37, wherein the methyl CpG binding domain is a
polypeptide selected from the group consisting of MECP1, MECP2,
MBD1, MBD2, MBD3, MBD4, dMBD-like, dMBD-likeA and
functional fragments thereof.

39. The method of claim 28, wherein a plurality of fusion molecules is
contacted with cellular chromatin, wherein each of the fusion
molecules binds to a distinct binding site.

40. The method of claim 39, wherein at least one of the fusion molecules
comprises a zinc finger DNA-binding domain.

41. The method of claim 39, wherein the expression of a plurality of genes
is modulated.

42. The method of any of claims 28 to 41, wherein the gene is in a plant 
cell. 

43. The method of any of claims 28 to 41, wherein the gene is in an animal
cell.

44. The method of claim 43, wherein the cell is a human cell.

45. The method of any of claims 28 to 44, wherein the fusion molecule is a
polypeptide.

46. The method of claim 45, wherein the method further comprises the
step of contacting the cell with a polynucleotide encoding the
polypeptide, wherein the polypeptide is expressed in the cell.

47. The method of claim 17, wherein a plurality of fusion molecules is 
contacted with cellular chromatin, wherein each of the fusion 
molecules binds to a distinct binding site. 

48. The method of claim 47, wherein at least one of the fusion molecules
comprises a zinc finger DNA-binding domain.

49. The method of claim 47, wherein the expression of a plurality of genes 
is modulated. 

50. A fusion polypeptide comprising:

a) a localization domain or functional fragment thereof; and

b) a DNA binding domain or functional fragment thereof.

51. The polypeptide of claim 50, wherein the DNA-binding domain is a
zinc finger DNA binding domain.

52. The polypeptide of claim 50 or claim 51, wherein the localization
domain is a methyl CpG binding domain.

53. The polypeptide of claim 52, wherein the methyl CpG binding domain
is a polypeptide selected from the group consisting of MECP1,
MECP2, MBD1, MBD2, MBD3, MBD4, dMBD-like, dMBD-likeA
and functional fragments thereof.

54. The polypeptide of any of claims 50 to 53, wherein the DNA binding
domain binds to a target site in a gene encoding a product selected
from the group consisting of vascular endothelial growth factor,
erythropoietin, androgen receptor, PPAR-γ2, p16, p53, pRb, dystrophin
and e-cadherin.




PCT/US01/42377

55. A polynucleotide encoding the fusion polypeptide of any of claims 50
to 54.

56. A cell comprising the fusion polypeptide of any of claims 50 to 54. 

57. A cell comprising the polynucleotide of claim 55. 

58. The fusion polypeptide of any of claims 50 to 54, further comprising: 

(c) a regulatory domain or functional fragment thereof.

59. The polypeptide of claim 58, wherein the regulatory domain comprises 
an activation domain. 

60. The polypeptide of claim 59, wherein the activation domain comprises
VP-16, p65 or functional fragments thereof.

61. The polypeptide of claim 58, wherein the regulatory domain comprises 
a repression domain. 

62. The polypeptide of claim 58, wherein the regulatory domain comprises 
a component of a chromatin remodeling complex. 

63. A polynucleotide encoding the fusion polypeptide of claim 58. 

64. A cell comprising the fusion polypeptide of claim 58. 

65. A cell comprising the polynucleotide of claim 63.